Three-dimensional digital library system

ABSTRACT

A computer system ( 100 ) and method for the storage, archiving, query and retrieval of information relating to 3D objects is provided. The système includes data acquisition ( 130 ) means for requiring point coordinate data about a three-dimensional object, a database component ( 105 ), a processor ( 103 ) and a user interface ( 110 ). The processor ( 103 ) is operable to generate modeled data ( 105 ) from the point coordinate data and to segment the modeled data ( 124 ) into feature data ( 122 ) representing a plurality of features of the object ( 118 ). The data is organized so that features of the 3D objects can be automatically extracted for online query and retrieval. The processor ( 103 ) is operable to store the modeled data ( 124 ) and the feature data ( 122 ) in the database component ( 105 ) and retrieve modeled data ( 124 ) and feature data ( 122 ) from the database component ( 105 ) using search criteria representing oobject features of interest. The user interface ( 110 ) is operative with the processor ( 103 ) to allow a user input search criteria to processor ( 103 ) and to display data retrieved by the processor as a representation of an object feature.

RELATED APPLICATION

This application claims the benefit of U.S. Provisional PatentApplication No. 60/370,507, filed Apr. 4, 2002, entitled “3D DigitalLibrary System,” which is incorporated herein by reference.

U.S. GOVERNMENT FINANCIAL ASSISTANCE

Financial assistance for this project was provided by the United StatesGovernment, DARPA MDA972-00-1-0027 and NSF IIS9980166. The United StatesGovernment may own certain rights to this invention.

BACKGROUND

This invention relates generally to information storage and retrieval.More specifically, it relates to a system and method for the storing,archiving, query and analysis of information relating tothree-dimensional (3D) materials and objects. The system can be usedwithin a distributed environment to catalog, organize and supportinteraction with 3D materials and objects at a feature-based level.

The understanding of 3D structures is essential to many disciplines,including scientific disciplines such as archaeology and physicalanthropology. For example, (1) archaeologists study the 3D form ofNative American pottery to characterize the development of prehistoriccultures; (2) lithic research seeks to understand the development ofstone tool technology over time by conjoining the remains of individuallithic reduction sequences; (3) comparative quantitative analysis of the3D joint surfaces of human and nonhuman primate bones and the effects ofsurface area, curvature, and congruency on the mechanics of a joint canfurther the understanding of physical anthropologists who focus on thereconstruction of manipulative and locomotor behavior of early humans.To date, the methods that anthropologists use to measure, record, andclassify 3D data such as calipers, photographs, drawings, and the use ofclassification schemes that reduce objects to simplified descriptors areinadequate for accurately representing such complex data.

Within the past decade, technologies for capturing 3D objects haveemerged and improved dramatically, offering the possibility to obtainand store accurate mathematical models of objects in digital form.Notwithstanding this progress, heretofore no system has provided 3D datarepresentation, storage, retrieval, semantic enrichment,application-specific feature extraction and development of a visualquery system in a distributed environment.

SUMMARY

In accordance with the invention, there is provided a computer systemand method for the storage, archiving, query and retrieval ofinformation relating to 3D objects. The system includes data acquisitionmeans for acquiring point coordinate data about a three-dimensionalobject, a database component, a processor and a user interface. Theprocessor is operable to generate modeled data from the point coordinatedata and to segment the modeled data into feature data representing aplurality of features of the object. The data is organized so thatfeatures of the 3D objects can be automatically extracted for onlinequery and retrieval. The processor is operable to store the modeled dataand the feature data in the database component and retrieve modeled dataand feature data from the database component using search criteriarepresenting object features of interest. The user interface isoperative with the processor to allow a user input search criteria toprocessor and to display data retrieved by the processor as arepresentation of an object feature.

The point coordinate data can be surface data or volume data. Themodeled data can comprise 3D triangular mesh data segmented into subsetsof vertices. Extracted features can include points, curves, surface dataand volume data. Segmentation of the modeled data is performed tosimplify the data and to organize the data into features or regions thatare meaningful for applications or to users. One advantageous method forsegmenting the modeled data uses a watershed segmentation method withimproved curvature estimation. Another advantageous method uses a hybridsegmentation scheme. Data compression can be used to compress themodeled data. An advantageous method for compressing the modeled datauses B-spline curves. Another advantageous method uses subdivisionsurface compression. Two advantageous methods for segmenting volume useWeibull E-SD fields and Greedy connected component labeling refinement.

The user interface includes a graphic user interface that provides avisual query system and method that allows a sketch-based search of thedatabase. Representative object shapes can be selected from a paletteand modified, or a freeform profile sketch can be created using thevisual query. The initial query request can be made in a variety ofmodes including text, vector graphics, and interactive 2D and 3D models.The user interface also can include text and numeric fields for parallelquery of descriptive and derived data within the databases. The visualquery interface displays search results, which can include textual dataand graphic data, including 2D and 3D representations of objects thatmatch the search criteria.

The system and method can be used with models to address various typesof data selected to represent different design challenges, volumetricand spatial elements, and modeling requirements.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated in and constitute apart of the specification, illustrate the presently preferredembodiments and methods of the invention Together with the generaldescription given above and the detailed description of the preferredembodiments and methods given below, they serve to explain theprinciples of the invention.

FIG. 1 is a functional block diagram of an exemplary computer for systemstoring, archiving, query and retrieval of information relating to 3Dobjects in accordance with the invention.

FIG. 2 shows an example of an XML schema according to the presentinvention.

FIG. 3 shows an exemplary region editor screen display for editingmodeled data representing 3D objects.

FIG. 4 shows an exemplary query input screen display for inputtingsketch-based, numeric, and text-based search criteria.

FIG. 4 is a flowchart of the visual query process.

FIG. 6 shows an exemplary screen for displaying results of a query inthe form of thumbnail images and summary data for 3D objects returned bythe search.

FIG. 7 illustrates an exemplary screen for displaying further detail ofa specific 3D object selected from among the search results of FIG. 6.

FIG. 8 is a diagram illustrating data flow, knowledge abstraction andquery according to the invention.

FIG. 9 illustrates an example of a triangulated, highly decimated pointcloud data set representing a surface of an object.

FIG. 10 shows in more detail the data flow and structure of thegeometric modeling module of the system of FIG. 1.

FIG. 11 shows in more detail the data flow and structure of the featureextraction, analysis and index generation module of the system of FIG.8.

FIG. 12 has been intentionally omitted.

FIG. 13 shows two pieces of an object, i.e. a core and flake of a stoneartifact, which pieces fit together and each of which pieces has beensegmented to extract its surface features for comparison.

FIG. 14 shows curvature based segmentation.

FIG. 15 shows merging similar regions by increasing the water depth.

FIG. 16 shows a comparison of (A) a Gaussian curvature plot and (B) amean curvature plot.

FIG. 17 shows (A) the min-max box for arbitrary axes and (B) the min-maxbox for the axes of the norm ellipse of the points.

FIG. 18: Figure showing local surface fit for curvature estimation.

FIG. 19: Considering the curvature sign may yield an incorrectsegmentation.

FIG. 20 shows an example where a triangle T_(i) will be such a grayarea: (A) shows an example of segmentation using the watershed methodwith no hard boundaries; and (B) shows the segmentation example of FIG.20A using the watershed method with such a boundary solution.

FIG. 21 illustrates the creation of triangles on boundaries of anoriginal mesh according to the hybrid method.

FIG. 22 shows a mesh segmented using the watershed method andillustrating no hard boundary.

FIG. 23 shows feature vertices and edges of a mesh.

FIG. 24 shows the steps of the hybrid segmentation method.

FIG. 25 shows creation of triangles on feature edges of an originalmesh.

FIG. 26 shows assigning regions for the case having multiple labels withmultiple common labels.

FIG. 27 shows segmentation of mechanical parts using the hybrid method.

FIG. 28 shows segmentation of a turbine and a wheel using the hybridmethod

FIG. 29 shows segmentation of a bone using the hybrid method.

FIG. 30 shows mesh traversal and parallelogram prediction method forEdgebreaker algorithm.

FIG. 31 shows mMesh traversal and recording of the error vector in theB-Spline algorithm.

FIG. 32 shows Left: normal case for recreating the geometry andtopology. Right: handling of special case if the vertex has been visitedor recorded before.

FIG. 33 shows: Example of Cow. Left: Triangle Mesh representation.Right: B-Spline approximation to the edge mid points. Each colorrepresents a different B-Spline.

FIG. 34 shows: Stanford Bunny Top left: Triangle Mesh representation.Top right: Zoomed in view of the mesh. Bottom left: B-Splineapproximation to the edge mid points. Bottom right: Zoomed in view ofthe B-Spline curves.

FIG. 35 shows an example of compression.

FIG. 36 shows a remeshing process according to one aspect of theinvention.

FIG. 37 shows a flow chart of the remeshing process of FIG. 36.

FIG. 38 shows In the Loop subdivision, each subdividing level producestwo types of vertices: vertex points and edge points as shown in FIG.38.

FIG. 39 shows

FIG. 40 shows the construction and comparison of a number of well-knowndata sets.

FIG. 41 shows (A) Polygonal Mesh with watershed defined areas forcomplete vessel and (B) Polygonal Mesh with watershed defined areas forpartial vessel.

FIG. 42 shows shows feature segmentation of complex shaped vessel.

FIG. 43 shows descriptive elements of ceramic vessel.

FIG. 44 illustrates four kinds of vessel bases, i.e.: (a) convex base;(b) flat base with zero curvature; (c) concave base; and (d) compositebase.

FIG. 45 shows a schema used for archaeological vessels.

FIG. 46 describes a schema used for lithics.

FIG. 47 shows a distribution experiment of a LCM volume data: (a) A realLCM volume data; (b) The distribution of the data

FIG. 48 shows the Weibull distribution with different shape parameters aand scale parameter b=1.2 and v₀=0.

FIG. 49 shows function tB(t,t)/2. It reaches maximum 0.72 at t=0.42.12

FIG. 50 illustrates classification of noise in a box by using twoparameters: s1 and s2. (A) Normal sample case: {64, 52, 64, 46, 50, 54,62, 43}, where s1=4.475 and s2=4.9; (B) High noise case {240, 52, 64,46, 50, 54, 62, 43}, resulted from case (A) just added one high noiseterm on purpose, where s1=0.825 and s2=3.6; (C) Low noise case {6, 52,64, 46, 50, 54, 62, 43}, resulted from case (a) just added one low noiseterm on purpose, where _(s)1=3.45 and _(s2)=0.575; (D) Low noise case{6, 52, 64, 46, 50, 54, 62, 234}, resulted from case (A) just added onelow noise term and one high noise term on purpose, where _(s)1=0.725 and_(s2)=0.525.

FIG. 51 shows randomly generating Weibull distribution using (7) withthe scale parameter b1.2 and three distinct shape parameters a=0.75, a=3and a10. (a) The whole volume data (b) the partial volume data with twoco-centric spheres

FIG. 52 shows generating 3D Weibull distributed volume data withdistinct shape and scale parameters. In (A), the blue part (cube) witha=0.75 and b=1.2, the red part (outside sphere) with a=3.0 and b=16.58,and the green sphere (inside) with a=10.0 and b=63.58. The images arerendered using Ray-Casting.

FIG. 53 shows E-SD plot of 3D control volume data above. The cone parton the left corresponds to cube outside of spheres, the disc at themiddle to the external sphere, the small region on the right-top to theinternal sphere. (a) Cube out of spheres (b) external sphere (c)internal sphere.

FIG. 54 shows the segmentation of control 3D Weibull distributed volumedata. The color is RGB(155, v, 0), where v is the value at the point (x,y, z).

FIG. 55 shows test bed for our Weibull E-SD modeling scheme. They comefrom two different experiments. The size of image (A) is 512×512×124,the gray-level is from 0 to 255, and there is only one cell, its spindleis at the up part of the cell. The size of image (B) is 512×512×146, hasthe same gray-level with (A).

FIG. 56 shows the E-SD plots of image (A) and (13) of FIG. 55,respectively. The colors have the same means as FIG. 53. In (A), thesize of window is (178, 237; 4, 27) corresponding to the segmentation(A) in FIG. 57, and the threshold e T=34, so there is a blank on theleft side. There are two clear fingers. In (13), the size of window is(167, 255; 2, 30) corresponding to the segmentation (13) in FIG. 57, andthe threshold e T =34, so there is a blank on the left side. There aretwo clear fingers. The box size is 2×2×2.

FIG. 57 shows the spindle segmentation of FIG. 55, respectively. Thecolor is RGB(0, v, 0), where v is the value at the point (x, y, z). Itare rendered by using OpenGL point clouds. The segmentation in (A)includes 3902 boxes, and 11308 boxes in (B).

FIG. 58 shows the region corresponding to the small finger in FIG. 58A.It is rendered by using OpenGL point clouds from two different views.

FIG. 60 shows a test bed for our volumetric segmentation scheme. FIG.60A shows an LSCM image of GFAP-labeled astrocytes and DAPI stainednuclei, in which the green highlights regions targeted with GFAP labeledastrocytes, and the red regions identify DAPI, which consistentlytargets nuclei cells. FIG. 60B shows a LSCM image of GFAP-labeledastrocytes and electrode implant, in which the red targets implant andthe green GFAP. Both are rendered by MIP in 3D.

FIG. 61 is a plot of function g(a) defined by equation (3) for foursamples which are generated by Weibull pdf with b=1:2, and v0=0:0 anda=0:7; 2; 3, or 10, respectively. And 0.7, 2.1, 3.05 and 9.8 are theroots of g(a)=0 corresponding to the samples.

FIG. 62 shows the Weibull parameter a-b histogram for the LSCM imageshown in FIG. 1(a) and the color legend, where N(a;b) is the number ofvoxels which satisfy Weibull parameters are a, and b

FIG. 63 shows an example of hierarchy frame for a 2D image.

FIG. 64 shows segmentation (a) before GCCL, and (b)after GCCL. Both arerendered by point clouds in 3D.

FIG. 65 illustrates smoothing connected components using convolution: aGFAP labeled astrocyte (a) before smoothing, and (b) after smoothing,and a nuclei (c) before smoothing and (d) after smoothing

FIG. 66 illustrates the segmentation results of the LSCM image of theGFAP-labeled astrocytes and DAPI stained nuclei (FIG. 60A)

FIG. 67 shows the connection analysis for GFAP labeled astrocytes inLSCM image FIG. 60A

FIG. 68 illustrates the segmentation results of the GFAP and implantLSCM image FIG. 1(b), (a) two channel data, and (b) the GFAP-labeledastrocytes, and (c) the electrode implant segmented

DESCRIPTION

Reference will now be made in more detail to the presently preferredembodiments and methods of the invention as illustrated in theaccompanying drawings, in which like numerals refer to like partsthroughout the several views.

The present invention provides a system and method for the storing,archiving, query and analysis of information relating to 3D objects. Thesystem can be used within a distributed environment to catalog, organizeand support interaction with 3D materials and objects.

The Computer Network System

FIG. 1 illustrates in schematic block diagram form an exemplary computernetwork system 100 for practicing the invention. The computer networksystem 100 includes a server computer system 102 and a client computersystem 104, which are connected by data connections 105, 107 to acomputer network 106, e.g., an intranet, the Internet and/or the WorldWide Web, so that the client computer system 104 and the server computersystem 102 can communicate. As will be readily apparent to personsskilled in the art, the client computer system 104 is intended to berepresentative of a plurality of client computer systems 104, each ofwhich may communicate with the server computer system 102 via thenetwork 106, whether sequentially or simultaneously. It is to beunderstood that the terms client and server are to be construed in thebroadest sense, and that all such constructions of the terms areintended to fall within the scope of the appended claims.

With reference to the network 106, it is contemplated according to thepresent invention that each of the clients 104 may access andcommunicate with the network 106 through any data communicationtechnology. For example, the client 104 may comprise one or morepersonal computers that are part of a conventional local area network(LAN) and the data connections 105, 107 may be provided by dedicateddata lines, Personal Communication Systems (PCS), microwave, satellitenetworks, wireless telephones, or any other suitable means. For example,the client computer 104 that can be connected to the Internet via aphone network, such as those provided by a local or regional telephoneoperating company, or a coaxial cable line. In any case, client(s) 104are adapted to communicate with the network 106 such that informationmay be transmitted to and from the server 102, e.g., through one or morerouters, wide area networks (WANs), satellites, hubs, repeaters, bridgesand gateways, as is known in the art. Data transmissions are typicallypassed from network to network in packets that include not only thesubstantive aspects of the data transmission, but addresses, errorchecking information and the like.

The computer network system 10 advantageously makes use of standardInternet protocols including TCP/IP and ITTP. TCP/IP is a commontransport layer protocol used by a worldwide network of computers. HTTPis a known application protocol that provides users access to files(which can be in different formats such as text, graphics, images,sound, video, etc.) using a standard page description language known asHypertext Markup Language (HTML), DHTML or XML. Known HTML web browsersallow for graphical user interface (GUI) based access to HJML documentsaccessible on servers communicatively linked to the client 104. Thesedocuments are commonly referred to as “web pages.” Although the client104 and the server computer 102 are coupled together via the Internet,the invention may also be implemented over other public or privatenetworks or may be employed through a direct connection and any suchcommunication implementation is contemplated as falling within the scopeof the present invention.

Server Computer System

The server computer system 102, which has a conventional architecture,includes: a central processing unit (CPU), which may be implemented witha conventional microprocessor; means for temporary storage ofinformation, which may be implemented with random access memory (RAM);and means for permanent storage of information, which may be implementedwith read only memory (ROM); and means for mass storage of information,which may be implemented by hard drive or any other suitable means ofmass storage known in the art.

It will be obvious to someone of ordinary skill in the art that theinvention can be used in a variety of other system architectures. Asdescribed herein, the exemplary system architecture is for descriptivepurposes only. Although the description may refer to terms commonly usedin describing particular computer system architectures the descriptionand concepts equally apply to other computer network systems, includingsystems having architectures dissimilar to that shown in FIG. 1.

Still referring to FIG. 1, the server computer system 102 includes a Webserver 103, and a database server 105. The Web server 103 managesnetwork resources and handles all application operations between thebrowser-based clients 104 and the server side applications. The databaseserver 105 includes a database management system (DBMS), a collection ofprograms that enables the storing, modification and extraction ofinformation from databases 114. The Web server 103 facilitatescommunication and data exchange between the client 104 and database 114.

One or more data acquisition devices 130 can be used to generate raw 3Ddata about an object and to input the raw 3D data to the server 102.Some examples of suitable data acquisition devices 130 include laserscanners for acquiring 3D surface data and MM scanners, CAT scanners orLaser Confocal Microscopes for acquiring 3D volume data It will beunderstood, however, that the data acquisition devices 130 can includeany device that generates digitized 3D data from an object.

A program kernal 116 executes on the server computer system 102 andimplements the logical operations of the system, as described below. Thekemal 116 includes a geometric modeling module 118, which can derive 3Dand 2D modeled data from the raw 3D data. A feature extraction, analysisand indexing module 120 can operate to provide cataloging, descriptionand interactive access to the modeled data, as well as to stored textualand descriptive data about the object. A data compression module 121 cancompress certain data for enhanced storage and transmission.

A region editor program 132 also executes on the server computer system102. The region editor program 132 functions to break the mesh down into“meaningful” connected subsets of vertices called “regions.”

The database 114 can comprise a number of different databases forstoring various data element. These data elements include: 3D acquireddata 122, such as that generated by the data acquisition device 130;modeled data 124, which the geometric modeling module 118 generates fromthe 3D acquired data; derived data 125, which the extraction, analysisand indexing module derives from the modeled data 124, and textual andnumeric data 126. The database 114 also stores metadata 128 fordescribing the data and how the data is formatted. The schema discussedherein further describe the metadata 128.

The 3D acquired data 122 can be 3D surface data, such as that generatedby optically scanning the surface of an object, or it can be 3D volumedata that includes information about the interior of an object, such asthat obtained from MM scans, CAT scans or Laser Confocal Microscopevolume data. The modeled data 124 is data that has been structured ormodeled from the acquired data 122 using suitable models, such astriangular meshes, surface representation models or volumerepresentation models. The derived data 125 includes data derived fromthe modeled data, such as object features or mathematical descriptivedate extracted from the modeled data. The text and numeric data 126 caninclude, for example, area curvature distribution and the like.

The data elements are organized, cataloged and described according to aschema to facilitate efficient query of and interaction with the data Ina presently preferred embodiment, the schemas are written in XML(eXtensible Markup Language). XML is a standard format for structureddocument/data interchange on the Web. Like HTUL, an XML document holdstext annotated by tags. Unlike HTML, however, XML allows an unlimitedset of tags, each indicating not how something should look, but what itmeans. This characteristic helps facilitate information sharing. FIG. 2depicts an example of an XML schema according to the present invention.The schema can include application specific categories and can beorganized to differentiate general information (e.g., provenence, timeperiod) and 2D information (e.g., height, length) from 3D information.The 3D information data structures at the top-level house originalbinary data files, modeled files, and derived features. One level downthe hierarchy, cognitive pattern primitives (CPPs) store the basic shapedata that form the building blocks out of which meaningful features areconstructed. The schema allows shape grammar and shape algebra to beused to provide a natural language syntax to write a shape as aconstruct of tokens (CPPs) and combination rules. To link this spatial,modeled, and constructed data to descriptive data in existing databases,the schema provides a master identification number for use as the key tolink data elements for a given artifact. This model is consistent withDublin Core cataloging structures and emerging data standards within thediscipline. This design permits a query engine to link the data elementswith additional data in other databases by mapping appropriate datafields between the databases using a master ID#.

Client Computer System

Still referring to FIG. 1, the client 104, which also has a conventionalarchitecture, includes: a central processing unit (CPU), which may beimplemented with a conventional microprocessor; means for temporarystorage of information, which may be implemented with random accessmemory (RAM); and means for permanent storage of information, which maybe implemented with read only memory (ROM); and means for mass storageof information, which may be implemented by hard drive or any othersuitable means of mass storage known in the art; one or more inputdevices for inputting information into the computer, which may beimplemented using a conventional keyboard and mouse, and an outputdevice for display graphical output to the user, such as a conventionalmonitor. The client 104 includes conventional hardware and software forcommunicating over the network 106. In a presently preferred embodiment,this includes a Web browser software application 110, which executes onthe client 104 to access information available on the Internet. Anyconventional browser 110 is contemplated for use according to thepresent invention, e.g., Netscape Navigator or Microsoft InternetExplorer, provided that the browser supports JAVA and HTML 3.0 orbetter. The system and method of the present disclosure advantageouslyutilizes the full functionality of such browser programs, as are knownto persons skilled in the operation and use thereof.

Graphical User Interface

The client 104 utilizes a GUI by which a user can view, store, modifyinformation in and extract information from the database 114. In thepresently preferred embodiment, the GUI comprises a collection of HTMLpages transmitted to the client 104 by the Web server 103 and displayedby the Web browser 110. The GUI includes interactive surface and volumevisualization capabilities and quantification tools to extractcurvature, volume, scale, and linear dimensions for each data set andallows the user to generate a sketch-based of the database 114.

Region Editor

Referring to FIG. 3, an exemplary client screen display 208 is providedfrom which certain aspects of the region editor 132 may be described.Client screen display 200 is representative of region editor screenlayouts that may be viewed by users of clients 104. A display window 180displays modeled data as an image 182 and numeric data. The user canedit the modeled data using a toolbar 184.

Visual Query

The GUI also includes a visual query interface, which allows users toinput, analyze, refine and limit searches of the database 114. Thevisual query interface combines a sketch-based search capability withtraditional text and metric data. Referring to FIGS. 4-7, an exemplaryclient screen display 200 is provided from which certain aspects of thevisual query interface may be described. Client screen display 200 isrepresentative of visual query screen layouts that may be viewed byusers of clients 104. The client screen display 200 can display a queryinput screen 202 including a shape input window 204 and a text inputwindow 206. The shape input window 204 allows a user to input a queryimage 206, which can be in the form of vector graphics or a 2D and 3Dimage. The query image 208 can be created as a freeform profile sketchor it can be selected from a palette of stored representative objectshapes. Using a toolbar 210, the user can manipulate the query image 208in real time to refine searches. The text input window 206 includesinput fields 212 for inputting textual and numeric search parameters.These support parallel query of descriptive and derived data within thedatabases 114. The user can manipulate and resubmit the query image inreal time to refine searches. After a user has entered the desiredcriteria into the query input screen 202, a “submit” button 213 can beselected to submit the query to database 114.

Referring to FIG. 5, when the user submits a query, the client 104 sendsthe search criteria to the Web server 103 as binary and textual data(step 250). The server system 102 then searches the databases 114 tomatch text and numeric, and calculated feature data, with the searchcriteria. A Common Gateway Interface (CGI) program accepts the data fromthe Web server 103 and parameterizes the search criteria (step 252). Thekernal 116 then extracts features, using appropriate feature extractionalgorithms as described below, from the sketch and executes a search ofthe database 114 (step 254). In a presently-preferred embodiment, JavaServer Pages (JSP) are used to implement the dynamic database queries,but it will be understood that other suitable technologies such asActive Server Pages (ASP) also can be used. The database server 105queries the database 114 (step 256) and returns the search results tothe kernal 116. The kernal 116 then compiles and tabulates the searchresult data (step 258). In a presently-preferred embodiment and method,XML is used represent the search results. Query results from thedatabase 114 are stored in XML format, and are visualized via apre-designed Extensible Style Language (XSL) file. The XSL file providesthe data needed to view detailed information for any of the returnedobjects without need for further interaction with the server 102 anddatabases 114. This includes thumbnail images and descriptive data for apredefined number of objects in the search results. The Web server 103sends the processed search results to the client 104, where they arecached (step 262) and displayed by client 104 via the Web browser 110(step 264), which extracts thumbnail and summary data from the XSL file(step 414) and displays the thumbnail and summary data in the initialsearch result screen.

FIG. 6 illustrates an exemplary search result screen display of initialsearch results. after the client 104 receives the search results, itdisplays those search results in the result window 204. The displayedresults can include thumbnail images 214 of objects having featuresmatching the search criteria as well as and descriptive data 216 forthose objects. When a user selects one of the returned thumbnail images214, which prompts the client browser 110 to extract a manipulable 3Dmodel of the selected object and detailed descriptive data. The clientbrowser 110 then displays the 3D model 218 of the selected object alongwith additional descriptive data 220 for that object. The user canmodify the query sketch by returning to the initial search result screenor new search screen. To view more details associated with the selectedthumbnail image, the user can select a detail link 221 which will causea separate display page to be called (step 266) and displayed as a fullresults display 268. FIG. 7 illustrates an example of a screen displayof the full results of a search. The full results display 268 includesadditional 2D, 3D and descriptive data for the selected thumbnail,including features such as a 2D profile curve data 222, curvature plotdata 224 and full textual and numeric data 226.

Systems Operation

Referring to FIG. 8, additional aspects of the operation of the system100 will now be described in further detail. Generally, operation of thesystem includes a data acquisition process 300, modeling of the acquireddata 302, segmentation and feature extraction 304, feature analysis andindex generation for storage of data 306, and online display and queryof the stored data 308.

Data Acquisition

In the data acquisition process 300, 3D raw data is input to the system100. This raw data can be 3D surface data, such as that generated byoptically scanning the surface of an object, or it can be 3D volume datawhich includes information about the interior of an object, such as thatobtained from MRI scans, CAT scans or Laser Confocal Microscope volumedata. In the example of 3D surface data, when an object is scanned usinga laser scanner, several tens or hundreds of thousand point coordinatesare generated, each representing a location on the object. Thiscollection of points is referred to as a ‘point cloud.’ It has nostructure, and is simply a file containing point coordinate data as x,y, z values. Point cloud information also can be generated usingcomputer programs, either interactively or automatically frommathematical functions, as is well-known in the art. In either case, towork with this point cloud, it has to be structured or modeled.

Geometric Modeling

The geometric modeling module 118 operates on the acquired 3D data toperform geometric modeling (step 302) of that data. FIGS. 8, 10 and 11illustrates the data flow and structure of the geometric modeling module118. Geometric modeling can include modeling of 3D surface data intosurface representation models 310 as well as modeling of 3D volume datainto volume representation models 312. (FIG. 10)

Polygonal Meshes

One advantageous method to model 3D data is to triangulate the pointcloud to generate a mesh of triangles (Step 314) 314 having thedigitized points as vertices. 3-D triangular meshes are a well-knownsurface modeling primitive used extensively to represent real world andsynthetic surfaces in computer graphics. Each triangle ‘knows’ about itsneighbors, which is the structure that allows fast processing of thegeometry represented by the triangulation. FIG. 9 illustrates an exampleof a triangulated, highly decimated point cloud data set representing asurface of an object 316, i.e. a ventral surface of a stone flake. It isimportant to stress that the same set of vertices or data points couldhave several triangulations. The emphasis therefore, is on the verticesthemselves, rather than the ‘surface’ as represented by the triangles.

Surface Models

Because triangulation is a modeling primitive, the geometric modelingmodule 118 uses additional modeling steps to simplify the data andextract regions or object features that are meaningful for applicationsor to users. For example, to study surface contour shapes the surfacesmust be modeled. If an object is inherently smooth, it is advantageousto represent its surface by a smooth data format. One advantageoussurface representation model is via a parametric surface such an a NURBS(Non-Uniform Rational B-Spline) data set 316. A NURBS data set consistsof control points (besides degree and parameterization) arranged in agrid of points, roughly representing a given shape. Fitting points ofpolygonal meshes with least squares approximation generates such surfacemodels. NURBS data sets provide a compact data representation and makeit easy to calculate curvatures of objects to extract their essentialshape characteristics. Using such representations enables one to rebuildmodels, analyze properties such as curvatures and make quantitativemeasurements, as well “repair” incomplete models. A NURBS surface can berepresented as: $\begin{matrix}{{\overset{\_}{P}\left( {u,v} \right)} = \frac{\sum\limits_{i = 0}^{m}{\sum\limits_{j = 0}^{n}{w_{i,j}{\overset{\_}{d}}_{i,j}{N_{i,k}(u)}{N_{j,l}(v)}}}}{\sum\limits_{i = 0}^{m}{\sum\limits_{j = 0}^{n}{w_{i,j}{N_{i,k}(u)}{N_{j,l}(v)}}}}} & (1)\end{matrix}$where d_(ij), i=0, 1, . . . ,m; j=0, 1, . . . n are control points,w_(ij) are weights, N_(i,k)(u) and N_(j,l)(v) are B-Spline basisfunctions. When weights equal 1.0, this reduces to a non-uniformB-Spline surface.

2D Geometric Models

2D models can be used to simplify problems presented by 3D analysis. Forexample, 2D curves may be used to describe an object, such as a profilecurve used to describe an archeological vessel. To obtain a profilecurve, a cutting plane is projected through the 3D mesh representing theobject, and intersection points with the mesh are extracted and orderedin a chain code. NURB curves can be generated by fitting points of thechain code with least squares approximation.

One of the important characteristics of a curve is the curvature. Thecurvature can be very useful for analysis and classification of objectshape. In 3D space the curvature of curves is unsigned. However, forplanar curves in 3D space, positive curvatures can be converted intosigned curvatures. Curvature has useful information such as convexity,smoothness, and inflection points of the curve, and if this isinformation needed for analysis, cubic NURB curves can be used toapproximate profile curves $\begin{matrix}{{\overset{\_}{P}(u)} = \frac{\sum\limits_{i = 0}^{n}{w_{i}{\overset{\_}{d}}_{i}{N_{i,k}(u)}(v)}}{\sum\limits_{i = 0}^{n}{w_{i}{N_{i,k}(u)}}}} & (2)\end{matrix}$where {overscore (d)}_(i), i=0, 1, . . . , n are control points, w_(i)are weights, and N_(i,k)(u) are B-Spline basis functions. In oneexemplary application, because measures of curvature such as convexity,smoothness, and inflection points of the curve were needed for vesselanalysis, cubic B-Spline curves have been used to approximate profilecurves of archeological vessels. The accuracy of these derived curvesfar exceeds the manually sketched or mechanically derived curvescurrently used to describe these vessels. The result is a profile curvefrom which appropriate features such as corner points, end points, andsymmetry between both halves of the profile curve can be readilycomputed and extracted.

FIG. 7 illustrates a vessel image 218, its profile curve 222 and thesigned curvature plot 224 of the curve.

Feature Extraction and Segmentation

Referring to FIGS. 8 and 11, after modeling of the acquired data (step302), the feature extraction, analysis and index generation module 120performs feature extraction and segmentation (step 304) so thatappropriate feature data can be extracted for study. Although features(such as corner points on a pot, joint surfaces on a bone, and flakenegatives on lithics) are a diverse set, they can be extracted inaccordance with the invention using a common set of algorithms. As shownin FIG. 11, extraction of features from such objects and the use ofthese features to search for particular shapes involve multiple levelsof analysis and extraction algorithms. As a common denominator, however,the features are built from cognitive pattern primitives (CPP), whichare essentially the geometrically meaningful feature building blocks.Examples of CPPs include curves, for example a part of a profile curve,and surfaces such as segments of the surface of an object. Buildingfeatures from these primitives requires a construction method thatallows the CPPs to be arranged and connected in various ways and alsofor these connections and arrangements to be described mathematically.Features can be defined, then, by a set of CPPs and a set of operatorsto connect and arrange these CPPs.

Extracting features from modeled data can take on various formsdepending on the desired information. According to one exemplary featureextraction method, features are separated into four categories: point,curve, surface data and volume data. The extraction of point features,curve features and surface data features will now be described withreference to an exemplary application, i.e. the study of ceramicvessels.

Extracting Point Features: Corner Points

Using chain codes and NURB curves as described above, profile curves canbe extracted from modeled data. These profile curves can have a numberof points of interest to a user, such as corner points. Corner pointscan be described as an abrupt change in the orientation of an objectwall, or a distinct angle in the joining of object parts, such as neckand body of a vessel. For the automatic extraction of such points on aprofile curve there is a need to adequately describe this featuremathematically. Hence, corner points are points of maximum curvaturechange on a profile curve and are extracted by comparing curvaturevalues for all points on the profile curve.

The following algorithms have been developed to extract point features:

Algorithm for Extracting End Point Features

Input: a profile curve represented by B-Spline curve and chain coderespectively.

Output: end point features

1. end point 1 □ start point of chain code; end point 2 □ end point ofchain code;

2. center point □ center point of chain code;

3. find the base section around center

4. if base section is flat or concave then

-   -   total end point number □ 4; end point 3 □ left terminate of base        section; end point 4 □ right terminate of base section;    -   else {base is convex}    -   total end point number □ 3; end point 3 □ center;

5. calculate feature information for each end points, include spacecoordinates, parameter value, position on the chain code, and so on;

Algorithm for Extracting Corner Point Features

Input: a profile curve represented by B-Spline curve and chain coderespectively.

Output: corner point features

1. calculate curvature value for each points on chain code;

2. find points with local maximum (minimum) curvature value ascandidates for corner points;

3. for all candidates

-   -   if angle at the candidate point<a predefined value then the        candidate point is a corner point.

4. calculate feature information for each corner point, include spacecoordinates, parameter value, position on the chain code, and so on.

Inflection features and point of vertical tangency features can bedetermined because inflection point and point of vertical tangency canbe found by analyzing curvature value and tangent lines.

When computing the angle between points (x₁, y₁), (x₀, y₀) and (x_(r),y_(r)) in the algorithm for extracting corner features, the angle issensitive to sample error. In order to reduce the error due to sampling,instead of taking (x₁, y₁) and (x_(r), y_(r)) as points of the curve,the coordinates of these points are calculated by averaging thecoordinates of a group of neighbors to perform a less noise-proneresampling.

Consider the midpoint (x₀, y₀) of n contiguous points in a chain code ofa curve, where n is an odd number, and let p=n/2+1 be the point (x₀,y₀). Thus, the initial point of the angle (x₁, y₁) is calculated fromthe n/2+1 previous point as $\begin{matrix}{{x_{l} = \frac{\sum\limits_{i = 1}^{p}x_{l}}{{n/2} + 1}},{y_{l} = \frac{\sum\limits_{i = 1}^{p}y_{l}}{{n/2} + 1}}} & (3)\end{matrix}$

and similarly for the end point of the angle (xr, yr) $\begin{matrix}{{x_{r} = \frac{\sum\limits_{i = p}^{n}x_{l}}{{n/2} + 1}},{y_{r} = \frac{\sum\limits_{i = p}^{n}y_{l}}{{n/2} + 1}}} & (4)\end{matrix}$

Extracting Curve Features: Profile Curves

Profile curves include various other sources of valuable information.For example, one could be interested in the symmetry between left andright side of the profile curve. This information can be extracted intoa signed curvature plot, and provides a valuable tool for the evaluationof object symmetry. Further an ideal model can be generated from theprofile curve and used to evaluate how a real object deviates from thismodel.

Extracting Surface Features: Facets on Surface

Another level of feature extraction deals with surfaces. By extractingsurface information, one can retrieve 3D information rather thanreducing a 3D phenomenon to 2D information. To extract surfaceinformation, the complete object has to be segmented into disciplinespecific and geometrically meaningful surfaces. Examples of suchsurfaces are the different facets on a lithic artifact such as theventral surface, partial dorsal surfaces, and heel. These surfaces formthe building blocks in any effort to refit these artifacts into acomplex 3D jigsaw puzzle. FIG. 13 shows two pieces of an object, i.e. acore and flake of a stone artifact, which pieces fit together and eachof which pieces has been segmented to extract its surface features forcomparison.

Segmentation in this context thus refers to the problem of extractingfeatures or regions of interest from surfaces. According to one aspectof the invention, the computerized segmentation of triangular meshsurfaces uses absolute curvature estimates along with the basic ideas ofa Watershed algorithm to identify regions of interest and similarity onthe surfaces of objects. Applications of this method have been foundparticularly useful in recognizing facets on a lithic artifacts, such asthe lithic artifact shown in FIG. 13. These regions in turn form thebasis for searching for matches between stone artifacts.

Based on the idea of a watershed as used in geography, in which awatershed forms the dividing line between drainage basins, thecomputerized watershed segmentation scheme defines minima to which waterwould flow from the peaks surrounding that minima. In contrast to ageography application, where this principle uses elevation in relationto neighboring areas as the defining characteristic for subdivision,this application uses absolute curvature allowing the recognition ofpeaks as well as valleys as the separating lines between watersheds. Itwas found that absolute curvature aided by smoothing equations yieldedsuperior results to Gaussian curvature.

A first step in the segmentation process is the computation and storageof curvature at each vertex or data point in the original point cloud.(See FIG. 14) The curvature calculation for each point is based on apatch of nine or more points, around a particular vertex and the vertexitself. Next, absolute curvature minima are selected and form the“sink-holes” from individual regions. Subsequently, vertices areassigned to a specific minimum by determining the way the imaginarywater would travel (down the steepest drop in absolute curvature in thiscase). The initial regions are typically too numerous to be useful,therefore, we increase the “watershed depth” to merge adjacent similarregions (See FIG. 15). Making the action of determining the watersheddepth user defined we ensure flexibility to meet researcher needs. Whilesignificant depths that allow for the recognition of flake scars aretypically found at similar watershed depths, these depths tend to bedifferent for other application such as the recognition of jointsurfaces or parts of a pot such as neck, body & base (FIG. 15). Thesegmentation algorithm is described in more detail below.

Methods for Extracting Curve Features

Improved Curvature Estimation for Watershed Segmentation of3-Dimensional Meshes

One advantageous method for extracting curve features from 3D meshesaccording to the invention uses an improved curvature estimation schemefor Watershed segmentation of the 3D meshes. The method improves uponthe scheme proposed in A. Mangan and R. Whitaker. Partitioning 3DSurface Meshes Using Watershed Segmentation, IEEE Transactions onVisualization and Computer Graphics. Vol. 5, No. 4, Oct-Dec 1999 (Manganand Whitaker) for segmenting a 3D polygonal mesh representation of anarbitrarily shaped real world object.

The domain of the problem, called a 3D mesh (denoted by ) consists of aset of n points (vertices v_(i) εE³; 0≦i<n) and a set of planar convexpolygons (faces) made up of these vertices. 3D meshes are currently apopular surface modeling primitive, used extensively to represent realworld and synthetic surfaces in computer graphics. In order to generatemeshes, real world objects are often either digitized, i.e. pointsamples are collected from the surface of the object and theirconnectivity generated or sampled as a volume from which an isosurfaceis extracted. Alternately, the points (and their connectivity) aregenerated using computer programs, either interactively (e.g. using asurface design tool) or automatically (e.g. from a mathematicalfunction). A mesh is a digital or discretized structure representingsome continuous surface.

Segmentation means breaking down an existing structure into meaningful,connected sub-components. In the context of 3D meshes, thesub-components of the structure being broken down are sets of verticescalled regions which share some “commonality.” A value λ_(i) could beassociated with each vertex v_(i) in the data set which somehowencapsulates the characteristics of the locality of the vertex. Thedefinition of segmentation is one in which regions consist of connectedvertices which have the same λ (within a tolerance). Curvature isselected as the mathematical basis for region separation, i.e. thescalar value λ. Previously, curvature estimation from 3D meshes has beendealt with by extracting curvature from a locally fitted quadraticpolynomial approximant or by various discrete curvature approximationschemes. Vertices having the same curvature value would be grouped intoregions, separated by vertices with high curvature (which serve asregion boundaries).

The segmentation scheme described herein is derived from the watershedalgorithm for 3D meshes described in Mangan and Whitaker. The watershedalgorithm for image segmentation is well known in the 2D imagesegmentation field. Mangan and Whitaker generalize the watershed methodto arbitrary meshes by using either the discrete Gaussian curvature orthe norm of covariance of adjacent triangle normals at each mesh vertexas the height field, analogous to the pixel intensity on an image gridwhich drives the 2D watershed segmentation algorithm. Although the novelalgorithm described herein is derived from Mangan and Whitaker, itdiffers in the curvature estimation scheme details and is far morestable for real world noisy data.

Surface Curvature

The quality of results from the watershed segmentation algorithm dependssubstantially on the accuracy and stability of the estimated curvaturevalues. Curvature at the mesh vertices should faithfully reflect thelocal properties of the underlying surface. Additionally, it is usefulto have a curvature calculation technique that can a) estimate curvatureaccurately on mesh boundaries where there usually aren't enough verticesdistributed evenly around the vertex in question, and b) is relativelyunaffected by noise in the data.

To provide an understanding of the improved curvature estimation scheme,various terms used in the theory of surface curvature will be brieflydescribed.

The parametric surfaces considered are of the formx=x(u); u=(u,v)ε[a,b]⊂R ²  (1)

where u and v are parameters which take real values and vary freely inthe domain [a, b]. The functions x(u, v)=(x(u, v), y(u, v), z(u, v)) aresingle valued and continuous, and are assumed to possess continuouspartial derivatives.

The first fundamental form, denoted by I is given byI={dot over (x)}.{dot over (x)}=Edu ²+2Fdudv+Gdv ²  (2)

whereE=x _(u) ² =x _(u) .x _(u) , F=x _(u) .x _(v) , G=x _(v) ² =x _(v) .x_(v).  (3)

The second fundamental form, denoted by II given byII=Ldu ²+2Mdudv+Ndv ²,  (4)

whereL=Nx _(uu) , M=Nx _(uv) , N=Nx _(vv).  (5)and N is the surface normal at point x.

The normal curvature of the surface at point x in the direction oftangent t is given by $\begin{matrix}{\kappa_{0} = {{\kappa_{0}\left( {x;t} \right)} = {\frac{II}{I} = \frac{\left( {Lu}^{\prime} \right)^{2} + {2{Mu}^{\prime}v^{\prime}} + \left( {Nv}^{\prime} \right)^{2}}{\left( {Eu}^{\prime} \right)^{2} + {2{Fu}^{\prime}v^{\prime}} + \left( {Gv}^{\prime} \right)^{2}}}}} & (6)\end{matrix}$

Since the normal curvature is based on direction, it attains maximum andminimum values, called the principal curvatures. The principalcurvatures, K₁ and K₂, can be combined to give us Gaussian curvature,given by $\begin{matrix}{K = {{\kappa_{1}\kappa_{2}} = \frac{{LN} - M^{2}}{{EG} - F^{2}}}} & (7)\end{matrix}$

Gaussian curvature is invariant under arbitrary transformations of the(u, v) parameters of a surface as long as the Jacobian of the (u, v)transformation are always non-zero. Gaussian and mean curvature areinvariant to arbitrary rotations and translations of a surface becauseof the invariance of the E, F, G, L, M, N functions to rotations andtranslations.

Discrete Curvature Schemes

For the purpose of comparison, we first consider the two discretecurvature estimation schemes used in Mangan and Whitaker, i.e. thediscrete Gaussian curvature scheme and the norm of the covariance of thesurface normals of the triangles adjacent to the vertex at whichcurvature is being calculated.

Discrete Gaussian

The platelet P_(i) of a vertex v_(i) is defined as the set of alltriangles in the mesh sharing v_(i) as a vertex. The discrete Gaussiancurvature K at vertex v_(i) is given by $\begin{matrix}{K = \frac{2_{\pi} - {\Sigma\quad\alpha_{j}}}{\frac{1}{3}\Sigma\quad A_{j}}} & (8)\end{matrix}$where a_(j) is the angle subtended at v_(i) by each triangle j in P_(i),and A_(j) is the area of that triangle.

A problem that presents itself immediately is how to find the plateletof a vertex on the boundary of a mesh that is not closed (e.g., a meshrepresenting a sphere would be closed, i.e., have no boundary, while onerepresenting a bounded plane would not be closed). The platelet of aboundary vertex would be incomplete, giving rise to an inaccuratecomputed value for curvature. One option is to simply ignore boundaryvertices while computing curvature (setting their curvature to somefixed value). However, this often gives rise to unpleasant resultsduring segmentation because such vertices tend to establish their ownregions distinct from the rest of the mesh.

A solution to the problem would be to extend the mesh beyond theboundary in some “meaningful” way, allowing a curvature computationusing the extended platelet. Such an extension is not trivial. It wouldhave to somehow follow the shape of the mesh (and additionally, it isdesirable that the extension be symmetric) in order for the boundaryvertices to blend correctly into the existing mesh.

Gaussian curvature is an isometric invariant property of a surface. Anisometric invariant is a surface property that depends only on the E, F,G functions (and possibly their derivatives). Isometric invariants arealso known as intrinsic surface properties. An intrinsic property doesnot care how the surface is embedded in higher dimensional space. On theother hand other curvature metrics such as mean and absolute curvatureare an extrinsic property which does depend on the embedding in 3Dspace. Intrinsic properties do not change sign when the direction of thenormal vector of the surface is reversed. This is because the firstfundamental form does not depend on the surface normal, while the secondfundamental form does. Hence, Gaussian curvature maintains its sign whenthe direction of the normal vector is flipped whereas mean curvature(described below) flips its sign.

Isometric invariance is an undesirable property as far segmentation isconcerned. For example, in FIG. 16, we would ideally want the left andright halves of the surface segmented into separate regions at theridge. The Gaussian curvature plot shows that the curvature is constantthroughout. The segmentation, being based on curvature distribution,would be unable to distinguish the halves. The mean curvature plot, onthe other hand, shows the ridge being distinctly marked by highcurvature. As an additional disadvantage, Gaussian curvature being theproduct of the principal curvatures, is more sensitive to noise.

Norm Method

The second method used in the original paper computes the norm of thecovariance of the surface normals of the triangles adjacent to thevertex at which curvature is being calculated. This method, which weshall refer to as the Norm method, is more resilient to noise in thedata. It is better than using Gaussian but does not perform as well asmetrics used in the improved curvature described herein. This also hasno direct geometric significance.

Improved Curvature Schemes

The principal curvatures can be combined in other useful andgeometrically meaningful ways such as mean, RMS and absolute curvatures.

Mean curvature is given by $\begin{matrix}{H = {\frac{\left( {\kappa_{1} + \kappa_{2}} \right)}{2} = {\frac{1}{2}\frac{{NE} - {2{MF}} + {LG}}{{EG} - F^{2}}}}} & (9)\end{matrix}$

The mean curvature, being the average of the principal curvatures, isless sensitive to noise in numerical computation than the principalcurvatures.

Root mean square curvature (RMS) is a good measure of surface flatnessand is given by $\begin{matrix}{\kappa_{rms} = \sqrt{\frac{\left( {\kappa_{1}^{2} + \kappa_{2}^{2}} \right)}{2}}} & (10)\end{matrix}$

and can easily be computed asκ _(rms)={square root}{square root over (4H ² −2K)}  (11)

A value of κ_(rms)=0 at a point indicates a perfectly flat surface inthe neighborhood of that point.

Absolute curvature is given byκ_(abs)=|κ₁|+|κ₂|  (12)

Mean and RMS curvatures have a closed form as given by equations (9) and(11) and these do not require actual computation of principalcurvatures. κ₀ and κ₁ are expensive to compute and hence the cost ofcomputing Absolute curvature is considerably higher than computing meanand RMS counterparts.

Curvature Estimation

Curvature estimation from a triangulated mesh can be a non-trivialproblem depending on the accuracy required and the method used. Ourmethod uses a technique where the surface is approximated locally by abiquadratic polynomial whose second order mixed partial derivatives arecalculated. These in turn allow computation of the E, F. G, L, M, Nfunctions. An added advantage of fitting a surface locally is that thesensitivity of the surface to noise can be “adjusted” by using smoothingequations.

The parametric surface used is a tensor product Bézier surce and isgiven by: $\begin{matrix}{{{{x\left( {u,v} \right)} = {\sum\limits_{i = 0}^{2}{\sum\limits_{j = 0}^{2}{b_{i,j}{B_{i}^{2}(u)}{B_{j}^{2}(v)}}}}};u},{v \in \left\lbrack {0,1} \right\rbrack}} & (13)\end{matrix}$

b_(ij) are called Bézier control points: in their natural ordering theyare the vertices of the Bézier control net.

We perform a standard least squares fit to a set of vertices in order tocompute the b_(ij). The following were used to improve the local fit ofthe surface (FIG. 18).

1. Use double star of the vertex as input data points.

2. Add smoothing equations to the least squares solution

Selecting more vertices around the vertex for which the curvature isbeing computed ensures there are no gaps and the resulting surface doesnot behave wildly. The second option invokes constraints that a goodcontrol net would satisfy. One such constraint is to minimize the twistof the control net, which, for a surface is its mixed partial ∂²/∂u∂v.The twist surface of b^(m,n) is a Bézier surface of degree (m−1, n−1)and its vector coefficients have the form mnΔ^(1.1)b_(ij). GeometricallymnΔ^(1,1)b_(ij) measures the deviation of each sub-quadrilateral of theBézier net from a parallelogram. Minimizing the twist has the effect ofmaking each sub-quadrilateral close to a parallelogram. The constraintcan be stated as follows:Δ ^(1,1) b _(i,j)=0; 0≦i≦m−1, 0≦j≦n−1.  (14)From Equation 14, the condition can be restated asb _(i+1,j+1) −b _(i,j+1) =b _(i+1,j) −b _(i,j).  (15)

Finally, fitting a surface of higher degree than 2 gives the surfacegreater freedom to match the underlying data points. However, surfacesof higher degree come at a higher computational cost. Moreover, fitoffered by any surface of greater degree than a quadratic is usually acase of diminishing returns. A second-degree surface suffices forevaluation of second order partial derivatives.

The first three solutions presented above make the linear systemover-determined. This system can be solved using least-squares methods.However, we need additional information i.e. parameter values associatedto each vertex. This is solved as follows.

The data points (vertices) can be parameterized by projecting them ontoa plane and scaling them to the [0, 1] range. If the data points (in 2D)are distributed as shown in FIG. 17A, with the min-max box as shown,scaling them to the [0, 1] range results in areas in the domain with nodata points in them. This is undesirable as explained previously. To geta better parameterization we obtain as small an enclosing box aspossible. This can be done by selecting a local coordinate frame suchthat the min-max box is minimized. This problem can be solved by findingthe axes of the smallest ellipse that completely encloses the points.The axes of such an ellipse, called the norm ellipse, would then resultin the smallest min-max box for the point set (see FIG. 17B).

The Watershed Algorithm

The watershed algorithm will now briefly be described. More details canbe found in the in Mangan and Whitaker and the literature cited thereinOnce the curvature κ_(i) at each vertex v_(i) in the mesh has beencomputed and stored, the segmentation can begin. κ_(i) here represents ageneric curvature metric and can be Gaussian, absolute, etc. Duringsegmentation, a label will be assigned to each vertex v2 in the meshindicating to which region the vertex belongs.

The 3D watershed segmentation scheme aims to segment the mesh intoregions of relatively consistent curvature having high curvatureboundary vertices. It is instructive to imagine the curvaturedistribution “function” (or the height function) as a surface on thegiven surface. The watershed algorithm operates on this (curvature)surface.

Convex areas (elevations) on the object have a high magnitude ofcurvature with a positive sign, while concave areas (depressions) have ahigh magnitude of curvature with a negative sign (except for curvatureswhich are positive by definition). High positive curvature would appearas a ridge on the curvature distribution function, while a high negativecurvature would manifest itself as a valley. Since the originalwatershed segmentation algorithm segments the surface into regionsseparated by areas of high curvature, the valleys would not act asregion separators as would be expected (see FIG. 19A). We onlyconsidered the magnitude of curvature (FIG. 19B) to solve this problem.

Results and Conclusions

A flexible segmentation program was implemented, with user controlprovided for changing parameters and thresholds in order to arrive atthe desired segmentation of a surface. Here we show the results of oursegmentation process.

Among continuous methods, mean curvature was more resilient to noise innumerical computation since it averages the principal curvatures. Beingan extrinsic property, mean curvature produced results closer to theexpected results which were based on user perception of shape. RMScurvature had its advantages when dealing with specialized segmentation.However, absolute curvature resulted in segmentations that generallyoutperformed all others. Like mean, the absolute curvature is thesummation of (the absolute values of) κ_(i) and κ₂ giving it greaternoise resilience. Moreover, it is positive by definition unlike mean andGaussian curvatures.

Comparison of Curvatures and their Estimation Techniques

Continuous curvature estimation methods resulted in more accuratecurvature estimates than discrete methods. This was because:

-   -   Surfaces could be fitted to larger vertex neighborhoods,        increasing the closeness of the fitted surface to the actual        underlying surface.    -   Smoothing equations tended to offset the effect of missing data        points and high frequency noise in the data set.    -   Boundary vertices did not need special consideration as in the        discrete case since the surface could be fitted to more vertices        on one side of the boundary.

Segmentation using different curvature metrics. Use of Mean and RMSresults in almost the same quality segmentation as use of Absolute at acheaper computation cost.

Segmentation of human trapezium (thumb joint) was performed using threemethods for curvature approximation. Discrete Gaussian and Norm methodsresulted in too many regions inconsistent with the geometry of thesurface.

Segmentation of a lithic using three methods for curvatureapproximation. Discrete Gaussian and Norm methods produce geometricallyinconsistent region where as absolute curvature results in correctsegmentation. Worst Gaussian (K) Mean (H) RMS (κ_(rms)) Best Absolute(κ_(abs))

Table 1: Comparison of Curvature Measures for Segmentation

From the foregoing, it can be seen that an advantageous method forsegmenting a 3D mesh has been presented using a 3D mesh watershedsegmentation scheme. The segmentation is based on curvature, andtherefore several curvature estimation techniques have been implementedand analyzed. The robustness of segmentation has been improved byincreasing the accuracy of the curvature estimates by use of mean, RMSand absolute curvatures. The method successfully segmented several realworld data sets, both digitized and synthetic, and often noisy,primarily because of the effort that went into developing bettercurvature estimation techniques.

The method can be implemented in a program is semi-automatic since theresult is dependent on the merging threshold user input. One of thereasons why a single threshold does not produce the desired result onall data sets is that surfaces come in various sizes, and curvature isnot scale invariant. Uniformly scaling the smallest bounding box of theinput surfaces to a fixed size alleviates the problem to some extent. Infact, it was noted that surface data sets of the same type (e.g., pots)responded with very similar results for the same threshold values. Theother reason for this is that a perfectly valid segmentation maydisagree with the “expected” user perceived segmentation. This wasespecially true in this research since the segmentation goal for datasets from different disciplines was driven by the requirements ofexperts from those disciplines.

A Hybrid Approach to Feature Segmentation of Triangle Meshes

Another advantageous method for providing feature segmentation oftriangle meshes according to the invention will now be described.Previous approaches for segmentation of polygonal meshes use avertex-based method. A significant drawback of the vertex-based methodis that no hard boundaries are created for the features or regions. Eachvertex of an object has its own region information. Therefore, triangleson boundaries have multi-region information. The three vertices of atriangle can be part of three different regions, whereas the triangleitself would be a “gray” area, i.e. it would not belong to any oneregion. FIG. 20 shows an example where a triangle T_(i) will be such agray area. FIG. 20A shows an example of segmentation using the watershedmethod with no hard boundaries. In FIG. 20A the boundary triangles areshown in the white region. This means the regions will not have hardboundaries or edges. This is a known artifact of vertex-based Watershedsegmentation and is acknowledged in Mangan and Whitaker. To provide asolution for this “no hard boundary” problem, the vertex-based method offeature segmentation creates new triangles by adding mid-points to theedges that have different region labels. FIG. 20B shows the segmentationexample of FIG. 20A using the watershed method with such a boundarysolution.

FIG. 21 illustrates the creation of triangles on boundaries of anoriginal mesh according to the hybrid method. FIG. 21A shows thecreation of triangles on boundaries of an original mesh for a triangleshared by two regions. FIG. 21B shows the creation of triangles onboundaries of an original mesh for a triangle shared by three regions.Referring to FIG. 21, each new vertex contains multiple labels. This isan extension of the original algorithm. For the selection of thelabel(s) of a vertex which has multiple labels, the common label of thevertices of the triangle is selected. In FIG. 21A, there are twopossible diagonal splits to make triangles. We select the diagonal whichsatisfies max(min_(1≦i≦4)a_(i), min_(1≦i≦4)b_(i)) where a_(i) and b_(i)are interior angles formed by the diagonals, i.e. select the diagonalthat results in the best aspect ratio. Triangle meshes representingmechanical (CAD) objects are frequently sparse in some areas, with justenough vertices to define each area. This is usually a result of theoptimal triangulation from a CAD program or decimation process. In thiscase, the Watershed method may not segment the object properly or mayeven lose important regions on the objects. In FIG. 22, the main regionsof the object are treated as boundaries because there are not enoughvertices in the regions. The boundary solution mentioned above will notsolve this problem because the method does not create new regions fromthe boundary regions and the boundary regions will be lost. Moreover,some regions of this object are not segmented properly. This problem iscaused by the vertices on feature edges with similar curvatures andthose vertices may be treated as the same region. We solve this problemwith our proposed no hard boundary approach.

Feature Segmentation Based on Dihedral Angle

This method uses an edge-based method for defining boundaries. A FeatureEdge is defined as an edge shared by two planes whose normal vectorsmake an angle greater than a certain threshold. The edges obtained areintegrated into curves, and these curves are classified as jumpboundaries, folds (roof edges) and ridge lines. Jump boundaries andfolds are used to segment the mesh into several regions. The boundarylines are also treated as Feature Edges.

The main disadvantage of the Feature Edge-based method is that thisresults in many disconnected Feature Edges and thereby incompleteFeature Loops. FIG. 23 shows this problem. Feature Edges are shown inthe shaded spots.

Hybrid Approach

The hybrid method will now be described. This method creates regionswith complete Feature Loops.

FIG. 24 shows the steps of the Hybrid method. As explained before,optimally triangulated meshes pose problems for the Watershedsegmentation method. To overcome this, all of the feature edges in themesh are identified using a threshold (angle). First, a Feature Verticesare defined as vertices that make up a Feature Edge. The reverse is notnecessarily true; if both vertices of an edge are Feature Vertices, itdoes not automatically qualify the edge as a Feature Edge.

FIG. 25 illustrates the creation of triangles on feature edges of anoriginal mesh.

Step 1—All Feature Vertices are first defined based on a threshold angle(dihedral angle) as explained above.

Step 2—Next, new vertices are added. Vertices are added to edges of alltriangles that have all three vertices as Feature Vertices. See FIG. 25.The new vertex is added at the mid point of each edge. We then connectthem as shown to create four new triangles. If the triangle has threeFeature Edges, then the center point of the triangle is added and sixnew triangles are created as shown in FIG. 25B. Addition of verticesrequires fixing of the topology as illustrated in FIG. 25C. The triangleshown is a neighbor of the triangle to which we added the vertices. Thiscan lead to a hanging vertex problem. To fix this, we connect the newvertex with the vertex on the opposite edge to create two new triangles.As a result of the above, we have two kinds of new vertices; those thatlie on the Feature Edges (labeled FV high) and those that do not(labeled as FV low). The reason for labeling is explained below.

Step 3—Watershed Segmentation

Next, we apply Watershed segmentation to our modified mesh. We useabsolute curvature for our examples. Other curvature estimation methodsmay be used. Feature Vertices FV high are assigned the label of maximumcurvature. Since they lie on a Feature Edge, assigning them highcurvature ensures that a Feature Edge will be preserved as a hard edge.The rest of the vertices in the mesh follow the same procedure asdescribed in the Watershed algorithm for computing curvatures at thevertices. The Feature Vertices contain their own region labels as wellas labels of the neighboring vertices. The addition of vertices has animpact on the Descent and Region Merge operations of the Watershedprocess [9]. This is done to solve the \no hard boundary” problem whichhas been described and discussed before in this paper.

Step 4—Removing Vertices Added in Step 2

To restore the mesh to its original form we must remove the verticesadded to the mesh in step 2 above. This process restores the topology ofthe mesh also.

Step 5—Collating Triangles into Regions

The goal of this step is to assign triangles and not vertices todifferent regions. This is achieved as follows:

1. Case: All vertices have the same label

This is simplest of the cases. The triangle is assigned the region labelof its vertices.

2. Case: One vertex has a single label

This is the case when one vertex has a unique label but the other twovertices have multiple labels. The triangle is assigned the region ofthe vertex with single label.

3. Case: Multiple labels but only one common label.

The three vertices of the triangle have multiple labels each, however,there is only one label that is common. The triangle is assigned theregion label that is common to the vertices in the triangle.

4. Case: All edges are Feature Edges

The triangle qualifies as a region by itself and gets assigned a uniqueregion identifier.

5. Case: Multiple labels and multiple common labels

It is possible that the each vertex of a triangle has multiple labelsand there is more than one common label. To explain this we will use theexample in FIG. 26. Triangle T1 has vertices with region labels R1 andR2. One of the vertices also has the label R3. First, we check if aneighboring triangle shares a Feature edge with T1. In this case, wefind T2 shares common Feature edge with T1. We then compare the commonvertex labels of Ti (R1 R2) with common label(s) of T2 (R1). The labelthat does not belong to the set of common vertex label(s) of theneighboring triangle is assigned to the targeted triangle (R2).

If the targeted triangle has more than one feature edge, then the aboveprocess is repeated for each neighboring triangle that shares a featureedge 2. Each such neighbor contributes one or more labels. Then T1 isassigned the label that is common between the contributed labels. Thetargeted triangle may have no feature edge. It means that the triangleis in the same region as the regions of the neighboring triangles. So,the region label of any neighboring triangle is selected as the label ofthe targeted triangle. 2 Note that if all three edges are feature edges,then it is dealt under Case 4.

Results and Conclusions

The hybrid method relies on the Watershed method and additionally usesadvantages of the dihedral angle method. The hybrid method doessegmentation of smooth objects as well as mechanical objects. It alsosolves the “no hard boundary” problem. FIG. 27 shows two mechanicalparts. FIG. 27A is a gear and FIG. 27B is a housing. FIG. 28A shows aturbine. Notice that each blade has been segmented into a differentregion. FIG. 28B is a wheel (rim and tire). It is segmented into thefollowing regions i.e. the tire, rim, bolt holes and a hole for mountingthe wheel. These are correctly segmented. All examples are from publicdomain data sets.

FIG. 29 shows the bone data (from FIG. 23) segmented using the hybridmethod. We have pointed out the shortcomings of the most popularsegmentation approaches for triangle meshes and devised a hybrid methodthat overcomes these shortcomings. Future work would involve automaticselection of threshold values both for watershed and dihedral angles.

Data Compression Algorithms

As previously described, according to one aspect of the invention, thedata compression module 121 compresses modeled data for enhanced storageand transmission. Two advantageous compression methods are (1) trianglemesh compression using B-spline curves and (2) compression using asubdivision surface scheme. Each of these compression methods will nowbe discussed.

Triangle Mesh Compression Using B-Spline Curves

As more sophisticated objects are modeled or digitized, their trianglemesh representation becomes more complex and hence the size of such dataincreases considerably. Due to their large size, these files aredifficult to transfer and take a long time even for viewing. TriangleMesh Compression techniques exploit the geometric properties of the meshrepresentation to encode them into smaller files. A novel losslesscompression scheme using B-Splines is described.

A triangle mesh representation consists of information about geometryand connectivity, also known as topology. Geometry defines the locationof vertices in a (Euclidean) coordinate system. They are represented astriplets (x, y, z). Connectivity defines the sets of points that areconnected to form triangles or faces of the mesh. Triangles are given bythree index values, which identify the three vertices bounding thetriangle.

Geometry information is usually compressed by quantizing the vertices.Based on the number of bits chosen to represent the floating pointnumbers, the location of the vertex is converted to integers on a grid.Note that even though there is loss in the geometric accuracy due toquantization, it is still referred to as Lossless compression. Losslessrefers to the connectivity i.e. the triangle mesh is reproduced afterdecompression topologically exactly the same as the input mesh. Asuitable prediction scheme is used and the displacement between thepredicted and actual point is stored as an error vector. When theprediction is accurate, the error vectors are very small and hence canbe stored using fewer bits. Connectivity compression is achieved bydesigning a traversal scheme, which encodes the unique path taken duringmesh traversal, using predefined rules and tables.

The present novel compression scheme uses B-Splines and is loosely basedon the ideas presented in the Edgebreaker algorithm described in J.Rossignac, Edgebreaker: Transactions on Visualization and ComputerGraphics, 1999; J. Rossignac and A. Szymczak, triangle meshesconmpressed with Edgebreaker, Computational Geometry, Theory andApplications, 14(1/3), 119-135, November 1999; and J. Rossignac, A.Safonova, and A. Szym, Edgebreaker on a Corner-Table, Conference, Gemoa,Italy. May 2001 (Edgebreaker). A brief description of the Edgebreakeralgorithm follows.

A “seed triangle” is randomly picked from the mesh. Starting from it,the mesh is traversed one triangle at a time, based on rules oftraversal. The edge through which the seed triangle is entered is calleda “gate.” The path taken is written as a combination of codes. When atriangle is entered, it is marked as visited; all of its vertices arealso marked as visited. As we move from one triangle to another, thefarther vertex of the entered triangle is predicted based on the threevertices of the first triangle, using parallelogram prediction (FIG.30). The difference between the predicted vertex and the actual vertexis then stored as an error vector.

The error vectors are stored only for vertices which have not beenvisited yet. Hence, after the traversal, the number of codes would beequal to the number of triangles and there is exactly one error vectorcorresponding to each vertex.

The topology is encoded as a series of codes corresponding to the meshtraversal, and the geometry is encoded as error vectors. Using thisinformation, the original mesh is reconstructed during decompression.

During our tests we noticed that topology encoding makes up for roughly10% of the compressed file (compressed using Edgebreaker). The challengethen is to reduce the compressed file size related to the geometry.Hence there is a need for a better prediction algorithm than theparallelogram method. We experimented with B-Splines. The dual benefitof using our method are that even though we use the mesh traversalmethod of Edgebreaker during compression we do not store any codes, and,the error prediction is improved. The connectivity is the automaticresult of the algorithm. First we describe the compression.

Compression

The B-Spline compression scheme uses the same traversal rules asdescribed in Edgebreaker. However in our method, when we move from onetriangle to the neighboring one, the mid point of the common edge isrecorded. Once all the triangles are visited, many sequences oftriangles, called triangle strips, are obtained. Each such strip has acorresponding sequence of edge mid points, to which a B-Spline curve isfit using least squares approximation. The order in which the B-Splinecurves are generated is recorded (FIG. 31). The parameter value,corresponding to each midpoint on the curve, and the error vector, whichis the difference between the actual position of the midpoint and theone computed using the parameter value, are also recorded.

The parameter value is stored only if the vertex opposite to the gatehas not been visited yet. If it is marked as visited due to thetraversal of an adjoining triangle earlier in the compression process,the index of the edge that is shared with that adjoining triangle isstored for referencing it during decompression. This avoids storing thealready visited vertices multiple times.

Decompression

After the compression process, we obtain an ordered set of B-Splinecontrol points, the parameter values corresponding to the midpoints ofedges and the error vector associated with each midpoint. The twovertices forming the gate edge are also given.

During decompression, the B-Spline curves are evaluated at their set ofparameter values in the order in which they were created. Each pointthus evaluated on the curve is then displaced by the corresponding errorvector to generate a mid point (FIG. 32). This mid point, along with oneof the two points obtained during the previous step (during the firststep, they would be the gate vertices) is used to determine the thirdvertex forming the triangle.

Points that are revisited store a reference to the edge common betweenthe current triangle and the adjoining one visited previously. This edgeis used to locate the actual point, which is then used to construct thecurrent triangle.

Preliminary results obtained by compression using the B-Spline schemeare very encouraging. The control polygon points and the error vectorsare stored as triples of floating point numbers, each stored using 8bits. The number of bits used to store parameter values is dynamicallycomputed based on number of points on the B-Spline curve and istypically 5 bits. The B-Spline curves used for our experiments are ofdegree five and have uniform parameterization (FIGS. 33 and 34).

We note the following two differences from Edgebreaker algorithm. First,information about both geometry and topology is encoded into theB-Spline curves. Hence, unlike Edgebreaker, our method does not storecodes for topology. Second, the error vectors associated with mid-pointsusing B-Splines are about ten times smaller than the error vectorsobtained in Edgebreaker, which uses parallelogram prediction. On anaverage, the compressed file size is 10% smaller than that ofEdgebreaker.

Compression using a Subdivision Surface Scheme

Subdivision surfaces are finding their way into many Computer AidedDesign and Animation packages. Popular choices include Loop,Catmull-Clark, Doo-Sabin etc. Subdivision surfaces have many designadvantages over traditional use of NURBs. NURB surfaces always areproblematic when multiple patches meet.

Reverse engineering (RE) is associated with the idea of scanningphysical objects and representing the resulting dense cloud of pointswith mathematical surfaces. In RE the goal is to convert the dense pointof scanned points into a patchwork of NURB surfaces with most effortgoing into automating the process. With the emergence of subdivisionsurfaces as popular modeling tools, it only follows that a similarprocess be devised for this class of surfaces. Our paper looks atdeveloping one such method for Loop surfaces.

Given a dense triangular mesh, we would Eke to obtain a control mesh fora Loop subdivision surface which approximates the given mesh. Thisprocess benefits subdivision surfaces in animation and manipulation thatneed speed over accuracy with an ability to manipulate the control meshand to regenerate the smooth surface quickly. Like subdivision waveletsin a multi-resolution analysis, our method can perform level-of-details(LOD) with arbitrary topological meshes useful in applications requiringa fast transfer, less storage, and a fast rendering and interaction. Theresulting coarse control mesh is approximately 6.25% of the originalmesh therefore this method can also be used as a lossy compressionscheme.

Given dense unorganized data points such as a point cloud from a rangescanner, a triangle mesh can be constructed by various methods known inthe art. However, triangular meshes obtained are piecewise linearsurfaces. For editing, modeling, etc. dense triangle meshes are not anoptimal solution. Dense triangle meshes with lot of detail are expensiveto represent, store, transmit and manipulate. A tensor product NURBS(Non Uniform Rational B-Splines) and B-splines are the most popularsmooth surface representations. Considerable work has been done onfitting B-spline surfaces to three-dimensional points. This process isoften called the Reverse Engineering process wherein a digitalrepresentation of the physical object is created.

A B-spline surface is a parametric surface, and it needs a parametricdomain. Previously, some have proposed methods for parameterizing atriangular mesh and unorganized points respectively for a single surfacepatch A single B-spline patch can only model surfaces with simpletopological types such as deformed planar regions, cylinders, and tori.Therefore, it is impossible to use a single non-degenerate B-spline tomodel general closed surfaces or surfaces with handles. MultipleB-spline patches are needed for arbitrary topological surfaces; however,there are some geometric continuity conditions that must be met foradjacent patches. Therefore, using NURBS/B-splines is not the mostdesirable approach since it requires high-dimensional constrainedoptimization. A subdivision surface scheme such as Loops, does notsuffer from these problems, has compact storage and simplerepresentation (as triangle base mesh) and can be evaluated on the flyto any resolution with ease. It does not require computation of a domainsurface.

Similar in nature to subdivision wavelets, we would like to obtain acontrol mesh approximating original surface with small magnitude ofdetails (as a scalar function) needed to best reconstruct anapproximation of the original mesh (FIG. 35). The benefit of using ascalar-valued function is that its representation is more compact thanits traditional counterpart, a vector-valued geometry representation.One of our contributions is to obtain a control mesh with very smallmagnitude of displaced values (details) in order to have a very highcompression ratio while preserving surface details given a semi-regular(subdivision connectivity) of the original mesh.

Subdivision Surfaces

A subdivision surface is defined by a refinement of an initial controlmesh. In the limit of the refinement process, a smooth surface isobtained. Subdivision schemes previously have been introduced based onquadrilateral meshes. These schemes generalized bi-quadratic andbi-cubic tensor product B-splines. The triangular based subdivisionscheme was introduced by Loop which was a generalization of C² quartictriangular B-splines.

Remeshing

Remeshing means to convert a general triangle mesh and into asemi-regular mesh. It is called a semi-regular mesh because mostvertices are regular (valence six) except at some vertices (verticesfrom a control mesh not having valence six).

Semi-regular mesh can be used for multi-resolution analysis. Through theremeshing process a parameterization can be constructed and used inother contexts such as texture mapping or NURBS patches.

The overview of our method is as follows: an original mesh is simplifiedby a quadric error metric (QEM) based on Garland, M., and Heckbert, P.S. Surface simplification using quadric error metrics. Proceedings ofthe 24th annual conference on Computer graphics and interactivetechniques August 1997 to get a simplified control mesh. We decimate themesh to 6.25% of the number of original triangles. Based on Sizuki, H.,Takeuchi, S., and Kanai, T Subdivision Surface Fitting to a Range ofPoints. Computer Graphics and Applications, Proceedings. Seventh PacificConference, 1999, the control mesh is then adjusted such that after somelevels of subdivision the control mesh vertices lie close to theoriginal surface. The adjusted control mesh is then subdivided using theLoop's scheme Loop, C Smooth subdivision surfaces based on triangles.Master's thesis, University of Utah, Department of mathematics, 1987 fortwo levels obtaining subdivided mesh with its number of triangles closeto that of the original mesh. Based on Lee, A., Moreton, H., Hoppe, H.Displaced Subdivision Surfaces. Proceedings of the annual conference onComputer graphics (SIGGRAPH 2000), 2000, the subdivided mesh verticesare then displaced to the original surface. Our remeshing process isshown in FIG. 36. Most or all vertices, however, require displacementsduring this process even if the control mesh has been adjustedpreviously. With that in mind, we need to reverse the process of Loopsubdivision scheme to obtain a better control mesh with smaller andfewer displacement values (details). A flow chart of our process isshown in FIG. 37.

In the Loop subdivision, each subdividing level produces two types ofvertices: vertex points and edge points as shown in FIG. 38. Afterobtaining the subdivided and displaced mesh, the mesh is semi-regular(subdivision connectivity) and all vertices lie on the original surface.$M_{{({n + m})}{xn}}{\quad{\quad{{\begin{bmatrix}v_{1}^{i - 1} \\v_{2}^{i - 1} \\. \\. \\. \\v_{n - 1}^{i - 1} \\v_{n}^{i - 1}\end{bmatrix} = {{\begin{bmatrix}v_{1}^{i} \\v_{2}^{i} \\. \\. \\. \\v_{n - 1}^{i} \\v_{n}^{i} \\e_{1}^{i} \\e_{2}^{i} \\. \\. \\. \\e_{m - 1}^{i} \\e_{m}^{i}\end{bmatrix}\quad{M_{nxn}\begin{bmatrix}v_{1}^{i - 1} \\v_{2}^{i - 1} \\. \\. \\. \\v_{n - 1}^{i - 1} \\v_{n}^{i - 1}\end{bmatrix}}} = {{\begin{bmatrix}v_{1}^{i} \\v_{2}^{i} \\. \\. \\. \\v_{n - 1}^{i} \\v_{n}^{i}\end{bmatrix}\quad\begin{bmatrix}v_{1}^{i - 1} \\v_{2}^{i - 1} \\. \\. \\. \\v_{n - 1}^{i - 1} \\v_{n}^{i - 1}\end{bmatrix}} = {{M_{nxn}^{- 1}\begin{bmatrix}v_{1}^{i} \\v_{2}^{i} \\. \\. \\. \\v_{n - 1}^{i} \\v_{n}^{i}\end{bmatrix}}{Where}}}}},\quad{v\quad{is}\quad a\quad{vertex}\quad{point}e\quad{is}\quad{an}\quad{edge}\quad{point}i\quad{is}\quad{subdivision}\quad{level}M\quad{is}\quad a\quad{Loop}\quad{subdivision}\quad{matrix}}}}}$

The first reverse step applies to the mesh to obtain a coarser 1^(st)reverse subdivision mesh. We don't choose a subdivision matrix in FIG.39A because both vertex points and edge points are used in solvinglinear system of equation. The subdivision matrix is not a squarematrix, and we need to do least-square approximation via the normalequations to obtain the control vertices. If that has been done,displaced values would have been required for all vertices (both vertexpoints and edge points), and by doing that it defeats our goal of havingfewer number of required detail values (displaced values and errorvalues) and small magnitudes needed for high compression . Therefore,only the subdivision matrix for vertex points as shown in FIG. 39B isused instead. Here, the matrix is a square matrix and is invertibleassuming a vertex general position. We could solve for exact controlvertices. For large linear system, we use Gauss-Shedel iterativeapproach to solve the linear system. Now, we need to calculate displacedvalues and save them for a reconstruction phase. To get them, weinternally subdivide the 1^(st) reverse subdivision mesh one level. Thevertex points of this subdivided mesh (level 1) are about 25% of totalvertices, and they all have zero displaced values. The edge points mayor may not require displaced values. The percentage of edge pointsrequires zero displaced values as shown in Table 2. Note that thecomputation for displaced values for edge points at this level is tofind the distance along each vertex limit normal intersecting with theoriginal surface.

In the second reverse step, we again solve for the control vertices ofthe 1^(st) reverse mesh in similar- manner to that of the first reversestep. However, the error values (or what we call the displaced values inthe first reverse step) for the edge points are computed differently asshown in FIG. 38. The error values are computed as a dot product of edgepoint unit limit-normal and vector from its position (obtained fromsubdividing the solved control mesh) to its corresponding position(obtained from the 1^(st) reverse subdivision mesh). The end result is avery small coarse mesh plus some details using which one can approximatethe input mesh. The coarse mesh can be used as a model much like theNURBS/B-spline control mesh or the model and detail can be used as alossy compression scheme.

We reconstruct and compare a number of well-known data sets as shown inFIG. 40. After obtaining control meshes, we losslessly compress themusing software by 3D Compression Technologies Inc. to further reduce theoutput file size. For the encoding of details (error values anddisplaced values) we vary the quantization level from 8 to 12 bits perdetail value for comparison. Magnitude of Loop vertex limit normal isused as a scaling factor to further reduce the already small detailvalues. We compare the result between 8-bit and 12-bit detail encodingand found that by encoding with lower bits (8 bits), the reconstructedsurface is as good as that of the higher bits (12 bits). In addition,8-bit quantization would produce higher number zero details, which canbe further encoded using variable-length scheme. In the variable-lengthencoding, each detail value is either encoded by 1 bit for zero detailvalue or 9 bits otherwise. The ninth bit is to indicate the requireddetail encoding. As a result, our output surfaces have high fidelity oforiginal surface details as well as high compression ratio. Table 1below shows percentage of vertices requiring no displacedment values atdifferent levels. Comparison of the quantitative compression resultsamong 8-bit detail encoding, 12-bit detail encoding, and 9-bitvariable-length detail encoding (OPTZ) are shown in Table 3. TABLE 2Zero displacement at each level (with 8-bit quantization). % of no. ofvertex Total % of no. points (all vertex % of no. of edge of points Datapoints require zero points with zero requiring zero #V Level #Vdisplacement) displacement displacement Spock* Control mesh 1,039 16,3891 4,121 25.21 15.97 41.18 2 16,417 25.10 62.65 87.75 Igea* Control mesh2,102 33,587 1 8,398 25.02 8.86 33.88 2 33,586 25.00 20.47 45.47 Bunny+Control mesh 2,206 35,280 1 8,818 25.02 13.60 38.62 2 35,266 25.00 24.3449.34 Horse* Control mesh 3,032 48,485 1 12,122 25.01 55.02 80.03 248,482 25.00 72.15 97.15

TABLE 3 Quantitative compression results. Output Control mesh Size SizeCompress Input before after Details (displacement) (control #V Size #V3DCP 3DCP 8 bits 12 bits OPTZ mesh + Data #T (KB) #T (KB) (KB) (KB) (KB)(KB) OPTZ) Spock * 16,389 575 1,039 36.13 3.54 15.87 22.87 6.21 98.30%32,718 2,044 Igea * 33,587 1,181 2,102 73.82 5.85 31.18 46.18 27.1597.21% 67,170 4,198 Bunny + 35,280 1,240 2,206 77.5 6.16 32.5 48.5 26.7697.35% 70,556 4,408 Horse * 48,485 1,704 3,032 106.5 6.21 44.5 67.5 9.2599.09% 96,966 6,060Legend:Data with * courtesy of Cyberware; + courtesy of Stanford Univ. ComputerGraphics Lab#V = Number of vertices#T = Number of trianglesKB = Kilo-Bytes3DCP = Lossless compression by 3DCompress.com software (performed oncontrol-meshes)OPTZ = Optimized encoding scheme (variable-length encoding)

According to the novel method disclosed, we can approximate anyarbitrary topological boundary and non-boundary surfaces with highcompression and details. We have combined subdivision surface andscalar-valued displacement for our surface reconstruction. Our domainsurface is obtained in such a way that it is close to the originalsurface, the magnitude of displaced values tends to be very small and isfavorable as a compression scheme. With the high compression andfaithfully detailed surface, the method is can be used for applicationssuch as mesh compression, animation, surface editing and manipulation.

Curve Matching Algorithm

According to another aspect of the invention, the system can performcurve matching to compare the similarity or dissimilarity of two shapes.As used herein “shape” means a two or three-dimensional curve. One ofthe important characteristics of a curve is its curvature. Curvature isvery useful for analysis and classification of curve shape. We assume Aand B to be free form curves such that, either curvatures can becomputed, or these can be approximated by a function or a parametriccurve of degree two or higher. In 3D space the curvature of curves isunsigned. However, for planar curves in 3D space, we can representcurvature as a signed quantity.

A curve matching method according to the invention compares two curvesto obtain a measure, which decides the similarity on the analysis of thecurvature plots of the curves. The curvature plot of the curve withsmaller chord length is slid along the curvature plot of the longer one.The difference in areas of the curvature plots is obtained and theminimum difference value is attributed to the amount of match betweenthe curves.

The method of curve matching has been implemented on the profile curvesof pots, which profile curves are constructed by fitting a least squarescurve to the points obtained by intersecting a 3D vessel with a cuttingplane.

Curvature

The curvature of a curve measures its failure to be in a straight line.The faster the first derivative X′(t) turns along the curve X(t), thelarger is its curvature. Curvature is defined as the rate of change ofthe unit tangent vector X′(t). For a straight line, the curvature iszero as its second derivative vanishes. For a circle, the curvature isconstant. Let X(t)=(x(t), y(t), z(t)) be a curve, then the curvature att, k(t), is defined as $\begin{matrix}{k = {{k(t)} = \frac{{\overset{.}{X}\hat{}\overset{¨}{X}}}{{\overset{.}{X}}^{3}}}} & (1)\end{matrix}$

where X=dX/dt and X=d²X/dt² and “” denotes the cross product. Athree-dimensional curve has non-negative curvature. For planar curves,we may assign the signed curvature as, $\begin{matrix}{k = {{k(t)} = \frac{\det\left\lbrack {\overset{.}{X},\overset{¨}{X}} \right\rbrack}{{\overset{.}{X}}^{3}}}} & (2)\end{matrix}$

The points at which the curvature changes sign are called inflectionpoints. Therefore, a planar curve may have inflection points but, athree dimensional curve has none.

Curve Matching Algorithm

Given two curves as input, the curve-matching method according to theinvention computes a measure that determines the similarity between thecurves. For planar curves in 3D space, the curvature plot of the curvehaving the smaller chordlength is slid along the curvature plot of theother curve. The difference in areas of the curvature plot is computedby an integration routine such as the Trapezoidal rule or Simpsons rule.The minimum difference in area is noted and this, along with a weightedcomponent of the difference in chordlengths of the curves, produces ameasure which determines the similarity between curves.

Given two curves A and B to match, the first step is to evaluate thechordlengths of the curves. The two chordlengths are compared and theminimum of the two is used to select the length of the unit interval.What this means is that, the two curves are discretised into intervalsof the same length. The selection of the unit interval length determinesthe precision of the result. The greater the number of discreteintervals, the more accurate is the similarity metric. $\begin{matrix}{\delta = \frac{\min\left( {l_{A},l_{B}} \right)}{N}} & (3)\end{matrix}$

where, l_(A) is the chordlength of curve A, l_(B) is the chordlength ofcurve B and N is the number of discreet intervals.

The parameter values of the two curves are then recomputed with respectto their chordlengths. To find the new parameter value, the followingprocedure is used:

A graph is plotted with the chordlength as the X-axis and the parametervalue as the Y-axis. At discrete intervals of the curve, thecorresponding parameter value is computed by interpolating thecorresponding graph points. This procedure ensures that the curvatureplot of the curve is drawn with respect to the chordlength.

After obtaining the new parameter values of the curves, the firstderivative and second derivative are computed at the discretised points.They are used to compute the curvature at these points. These are thecurvature function ƒ_(A) and ƒ_(B) of the curves A and B respectively.

The curvature plots of the two curves are then analyzed to measure thesimilarity between the two curves. The minimum difference of the twocurvature functions determines the measure for curve matching.$\begin{matrix}{{{Area}_{\min} = {\min_{a}{\int_{a}^{a + l_{A}}\left\lbrack {f_{A} - f_{B}} \right\rbrack^{2}}}}\quad} & (4)\end{matrix}$

This can be viewed as sliding (this is discreet sliding) the curvatureplot of the curve with smaller arc length along the curvature plot ofthe other, and the difference in curvature plot areas is calculated atall positions. Since, the curves and their curvature plots are indiscrete form, with the same index representation, this is a matter offinding the area by trapezoidal rule. Different methods of computingthis difference in area can be used, such as the trapezoidal rule,Simpsons rule, and Riemann sum. The trapezoidal is the simplest. Forbetter results a more appropriate method may be used. The minimumcurvature area difference (Area_(min)) and its position (a) are notedfrom the above process.

Let (l_(A),ƒ_(A)) and (l_(B),ƒ_(B)) be the lengths and curvaturefunctions of the two curves A and B. Then, the total difference (d) willbe, $\begin{matrix}{d = \sqrt{{s_{1}\left( {l_{A} - l_{B}} \right)}^{2} + {s_{2}{\min_{a}{\int_{a}^{a + l_{A}}\left\lbrack {f_{A} - f_{B}} \right\rbrack^{2}}}}}} & (5)\end{matrix}$

In the above formula s₁ is a factor contributing to the difference inchordlengths of the curves and s₂ is the factor contributing to theirshapes (curvature). A weighted combination of shape and size gives agood metric to match the curves. A percentage match value (permatch) canbe computed using the following formula: $\begin{matrix}{{permatch} = {\left( {1 - \frac{d}{f_{A}}} \right)*100}} & (6)\end{matrix}$

It can be observed that the percentage match is always between 0 and100. The percentage match can be used as an important factor whenquerying a database for similar shapes. The best case occurs when acurve is compared with itself. In that case, the difference in thechordlengths is zero and the difference in the curvature plot area iszero. The value of the metric is zero. This is a perfect match: A=B. Inother cases, when a curve A is compared with another curve B having adifferent chordlength and curvature plot, the difference in thechordlengths is positive and the difference in the areas of thecurvature plot is also positive. The value of the metric is positive.The metric determines the amount of similarity.

The different combinations of the weights s1 and s2 in the aboveequation leads to different types of matches, i.e. partial matches,exact matches and shape matches. Variations in s₁ and s₂ and each ofthese different types of matches will now be described.

The first type curve matching that will be described is the partialmatch. Given two curves A and B, there is always a possibility that onecurve is a part of the other. But where exactly do they fit each other?Sliding one curve over the other is quite computationally intensivebecause the orientation of the curves may differ. By using the method ofcurvature plots, one can slide the curvature plot of one curve over thecurvature plot of the other curve, which is similar to sliding one curveover the other. Since the method uses curvature, which is transformationinvariant, the problems caused by orientation of curves are avoided. Fora partial match of curves, the shape and size of the curves areconsidered. In a partial match, the chordlengths of the two curves mayor may not be equal. The curvature plot of the smaller curve is slidalong the curvature plot of the bigger one. The difference in area iscomputed using the trapezoidal rule. In a partial match no weight isattributed to the difference in chordlengths. The minimum difference inarea and its corresponding position are noted. This is the best possiblepartial match. A threshold can be taken as a user input, whichdetermines how close the curves should be. The metric for curvesimilarity is as follows: $\begin{matrix}{d = {\min{\int_{a}^{a + l_{A}}\left\lbrack {f_{A} - f_{B}} \right\rbrack^{2}}}} & (7)\end{matrix}$where d is the measure for curve matching and ƒ_(A) and ƒ_(B) are thecurvature functions of the two curves A and B, respectively. If thevalue of the measure d=0, then A=B. If the value of the measured≦threshold, then A⊂B.

The next type of curve matching that will be discussed is the exactmatch. In the case of partial matches it was observed that the curvatureplots and their analysis are a better tool to determine the similaritybetween curves. However, in partial matches, the difference in thechordlengths of the two curves was not taken into consideration. In theexact match type of matching both the shape and the size of the curvesare taken into consideration while determining the similarity betweenthem. The curvature plot sliding is used as the main tool here todetermine similarity, hence, making this method transformationinvariant.

In the exact match of curves, the chordlengths of the two curves may ormay not be equal. Therefore, the curvature plot of the smaller curve isslid along the curvature plot of the bigger curve and the minimumdifference in area is calculated by the trapezoid rule and thecorresponding position noted. This position is the best possible partialmatch. A weighted difference of the chordlengths is added to obtain ametric for curve similarity as follows: $\begin{matrix}{d = \sqrt{{{s1}\left( {l_{A} - l_{B}} \right)}^{2} + {s_{2}\min{\int_{a}^{a + l_{A}}\left\lbrack {f_{A} - f_{B}} \right\rbrack^{2}}}}} & (8)\end{matrix}$where, d is the measure for curve matching, s₁ is the weight attributedto the difference in chordllengths, s₂ is the weight attributed to thedifference in the areas of the curvature plots, l_(A) is the length ofcurve A, l_(B) is the length of the curve B, ƒ_(A) and ƒ_(B) are thecurvature functions of the two curves A and B respectively. The value ofthe measure d determines the similarity between the curves.

The third type of curve matching is the shape match. Matching two curvesby their shape must be size independent. For example, one may want tocompare the similarity between a smaller circle and a bigger one. Inthis case, the shape matching metric should show a perfect shape match.Thus, to perform a shape match, the input curves are pre-scaled to thesame size. The scaling is done with respect to the chordlength of thecurves. The process of scaling is as follows:

Given two curves A and B as inputs, the first step is to compute theirchordlengths. Let l_(A) be the chordlength of the first curve and letl_(B) be the chordlength of the second curve. Let s be the scalingfactor. Mathematically, s is defined as $\begin{matrix}{s = \frac{l_{A}}{l_{B}}} & (9)\end{matrix}$

The curves are initially discretised and the curvature values arecomputed at the discretised points. The curve with smaller chordlengthis now scaled by the scaling factor s, which essentially means that thechordlength values of the smaller curve are scaled by s and thecurvature values are scaled by 1/s; After the process of scaling iscompleted, the difference in areas of the curvature plots is computed byone of the integration routines previously discussed. This difference inarea is the metric determining the similarity between the two curves.The metric for curve similarity is as follows: $\begin{matrix}{d = {\min{\int_{0}^{a}\left\lbrack {f_{A} - f_{B}} \right\rbrack^{2}}}} & (10)\end{matrix}$

If the value of the measure d=0,then curve A and curve B are of similarshape. Thus, two similar curves with different sizes be compared forshape using this method.

The foregoing method obtains a measure for matching two curves. Themethod provides an estimation of the shape similarity by the property ofcurvature of each of the curves. The difference in areas of thecurvature plots of the two curves is estimated and this measure is usedto determine the similarity. If the value of the measure is small thenthe shapes are similar and if it is large, they are dissimilar.According to the invention, this curve matching method described abovecan be used to query data sets for matching curves.

EXAMPLE USES

The following examples serve to illustrate certain presently preferredembodiments and aspects of the invention. These examples are not to beconstrued as limiting the scope of the invention.

Three of the examples describe projects that involve archaeological andbiological material, namely ceramic vessels, lithic artifacts, andbones. These projects are: (1) “3D Morphology of Ceramic Vessels” whichhas been used for archiving and searching information about thestructure of ceramic vessels; (2) “Lithic Refitting” which has been usedto (partially) automate the refitting process through 3D scanning andsurface modeling; and finally (3) “3D Topography of Joint Surfaces”which has been used to automate segmentation of osteological featuresand quantify the surface area and curvature of joint surfaces and thecongruency between reciprocal joint surfaces, allowing for thedevelopment of biomechanical models. Common to all of these projects arethe following aspects of the invention: geometric modeling, featurerecognition, and the development of a database structure aimed at making3D models of these artifacts available online for query.

Example 1 3D Morphology of Ceramic Vessels

3D knowledge plays an important role in archaeology. For example,archaeologists study the 3D form of Native American pottery tocharacterize the development of cultures. Conventionally, vesselclassification is done by an expert and is subjective and prone toinaccuracies. The measurements are crude and in some case eyeballing isthe method of choice.

As part of the 3DK project at Arizona State University in Tempe, Ariz.,a system according to the present invention has been developed as apilot project for the study of the 3D morphology of prehistoric NativeAmerican ceramic vessels. The aim of this pilot project is to learnabout vessel uniformity, and variation with respect to function asindicators of developing craft specialization and complex socialorganization among prehistoric Native American cultures. To accuratelycapture the complex curvatures and vessel form and size variation, theuse of metric rulers and visual inspection of 21) profiles isinadequate. Therefore, for the study of prehistoric pottery traditions,scalable, visual, and quantitative comparisons of curvatures have beendeveloped according to the invention.

Archaeological vessels were scanned and defined as a set ofthree-dimensional triangulated meshes composed of points, edges andtriangles. The original data was then modeled with parametric surfaces,extracting features to raise the level of abstraction of data andorganizing vessel data based on XML schema. A Web-based visual queryinterface permits users to sketch or select sample vessel shapes toaugment text and metric search criteria to retrieve original and modeleddata, and interactive 2D and 3D models.

Key identifying features of the vessels were identified to supportsearch and data retrieval, and a catalog of terms was developed todescribe these features within the context of anthropologicaldescription, cataloging standards, and emerging metadataclassifications. Mathematical models were developed to translate thesefeatures into measurable descriptions. Software was developed to extractfeatures from the vessel data and to raise the level of abstraction ofdata. An additional result is generation of vessel measurements far moreaccurate than has been possible using the traditional tools ofanthropology.

Shape information is obtained from scanned three- dimensional data ofthe archaeological vessels, using 2D and 3D geometric models torepresent scanned vessels, extracting feature information from geometricmodels and storing the feature information in a database for Web-basedretrieval. A Web-based Visual Query Interface (VQI) is used to archiveand search vessels.

Based on 3D scanned data, 2D measurements (height, rim diameter, minimumdiameter, maximum diameter) and 3D measurements (area, volume, volume ofwall) were generated, and a measure of symmetry was developed. The pilotproject used a control collection consisting of two Mexican flower pots,two Mexican mold-made pifiata pots, and three hand-built Tarahumara jarsto verify the validity of the formulas generated. The prehistoriccollections studied consist of a total of 87 Native American vesselsfrom the Tonto Basin, Arizona, and Casas Grandes in northern Mexico. Thecontrol study and the two sets of prehistoric vessels are used togenerate a quantitative analysis of vessel uniformity and symmetry. Theautomated measurements are important improvements, which facilitate thequantitative assessment of ceramic uniformity and standardization,indicators of the development of craft specialization, differentiationof labor, and the development of complex forms of social organizationamong prehistoric cultures.

For data acquisition, each vessel was scanned using a laser scanner tocreate a 3D scan of a given vessel. The data acquisition device 130included two scanners. The first is a Cyberware Model 15 scanner, whichis a mobile scanner having a width of about 75 cm and is particularlyuseful for smaller objects. The second laser scanner is a CyberwareModel 303ORGB/.NS scanner, which is large stationary scanner allowingthe capture of large objects. Both of these scanners are marketed byCyberware, Inc. of Monterey, Calif. Both scanners are equipped with aturntable on which the object to be scanned is mounted. Resolution andaccuracy of the scans are a result of the distance between the scanninghead and the object. The large scanner is therefore less accurate thanthe small scanner. Specifics on accuracy and resolution of the laserscanners are available from Cyberware, Inc. While individual scans takeonly 17 seconds, the total average scanning time for each vessel isabout two hours, depending on complexity, color, texture, etc. Thescanning produces data as a highly dense “point cloud” of informationthat visually describes, but does not physically represent the vessel.

The vessel data was modeled with parametric surfaces by overlaying athree-dimensional triangulated mesh onto the point cloud data to definethe vessel as a set of three-dimensional triangulated meshes composed ofpoints and triangles. Post processing included filling of holes (due toscanner “oversight”), removing noise, etc. A Ball Pivoting algorithm wasimplemented to convert point cloud data sets into triangle meshes. Insome cases light decimation was performed to reduce the density withoutlosing accuracy of the overall structure. The result is valid point setand topology data for each scanned vessel. The result is a model of thevessel that is composed of parametric surfaces with physical, measurableattributes.

Mostly archaeological vessels are (approximately) surfaces ofrevolution, and studying contour shape will suffice to gather shapeinformation about the whole object. According to archaeologicaldefinition, there are four kinds of feature points on profile curves tocalculate dimensions and proportions of vessels. They are End Points,Points of Vertical Tangency, Inflection Points and Corner Points foundon the vertical profile curve of a vessel. End Points (Eps) are pointsat the rim (lip) or at the base (i.e. top and bottom of vessels). Pointsof Vertical Tangency (VTs) are points at the place where is the maximumdiameter on spheroidal form or minimum diameter on hyperbolic form.Inflection Points (IPs) are points of change from concave to convex, orvice versa Corner Points (CPs) are points of sharp change on a profilecurve. FIG. 43 illustrates these four kinds of feature points of vesselprofile curves.

Four features are common to all vessels, i.e. orifice, rim, body andbase. Orifice is the opening of the vessel, or the minimum diameter ofthe opening, which may be the same as the rim, or below the rim. Rim isthe finished edge of the top or opening of the vessel. It may or may notbe the same as the orifice. It may have a larger diameter. Body is theform of the vessel below the orifice and above the base. Base is thebottom of the vessel, portion upon which it rests, or sits on a surface.The base may be convex, flat, or concave, or a combination of these.FIG. 44 illustrates four kinds of bases, i.e.: (a) convex base; (b) flatbase with zero curvature; (c) concave base; and (d) composite base.

From the foregoing definition for characteristic points and commonfeatures for all vessels, feature representation of vessels wasformalized as follows:

<Point Feature>:=<End Point Feature>|<Point of Vertical TangencyFeature>|<Inflection Point Feature>|<Corner Point Feature>;

<Curve Feature>:=<Rim Curve Feature>|<Orifice Curve Feature>|<Base CurveFeature>;

<Rim Curve Feature>:=<End Point Feature><End Point Feature>;

<Orifice Curve Feature>:<Corner Point Feature><Corner Point Feature>;

<Base Curve Feature>:=<End Point Feature>|<End Point Feature><End PointFeature><Region Feature>:=<Neck Region Feature>|<Body RegionFeature>|<Base Region Feature>;

<Neck Region Feature>:=<Rim Curve Feature><Orifice Curve Feature>;

<Body Region Feature>:=<Orifice Curve Feature><Base Curve Feature>;

<Base Region Feature>:=<Base Curve Feature>;

<Volume Feature>:=<Unrestricted Volume Feature>|<Restricted VolumeFeature>.

XML was used to represent information of vessels. An XML schema, asshown in FIG. 45, was designed to represent geometric information,feature information and measured value of archaeological vessels.Feature information is extracted from geometric information and isorganized according to the feature formalism in the XML schema. Alsofeature information is used to index vessels stored in the database. TheXML schema for a sample vessel with values is shown in Appendix 1.

As discussed above, the result of the 3D laser scan of the vessel andinitial processing is a polygonal mesh composed of faces, edges andvertices and their connections. Surface models are generated from thescattered points of this polygonal mesh by least squares fitting and/orrotating profile curves. B-Spline surfaces are used to represent thesesurface models. The B-Spline models are used for model rebuilding, erroranalysis, closing holes and gaps found on the archaeological artifacts,and measured value getting.

The mathematical modeling algorithms described herein were used to passsurfaces through scanned point cloud data to generate measurable datafrom these relatively small, diverse data sets. Surface and volumemodeling algorithms were applied to model and generate quantitative,descriptive data about the artifact. The data and processes developedgrew from and are consistent with the descriptive vocabulary of ceramicsresearchers in Anthropology. The level of accuracy in documentation andmeasurement of the artifacts far exceeded traditional techniques used inthe field. The binary and derived data about the vessel provide a recordthat can be re-analyzed in the future and provide a tool for researchwithout physical access, at remote locations, or after repatriation ofthe vessel. FIG. 41 shows two examples of polygon meshes with watersheddefined areas. FIG. 41A shows a complete vessel and FIG. 41B shows apartial vessel. FIG. 42 illustrates feature segmentation of a complexshaped vessel.

Archaeologists analyze vessels by defining various regions using aprofile. The profile curve is obtained by passing a vertical planeperpendicular to the base through the vessel. Typically profile curvesare sketched free hand, often by tracing and duplicating half of thevessel to create a symmetric, but not necessarily accurate,representation. In this research the polygonal mesh model is used togenerate a much more accurate profile curve than has been previouslypossible. The resulting profile curve is processed to remove noise dueto scanning error at the rim. Vessels are initially cataloged into fourbroad shape categories—simple, composite, inflected, and complex.

A segmentation schemes based on curvature, as described herein, was usedand therefore several curvature estimation techniques were used. Therobustness of segmentation has been improved by increasing the accuracyof the curvature estimates. Due to the accuracy required for goodsegmentation, computing curvature is a complex, non-trivial task. Inthis research, multiple curvature estimation schemes were used andcompared. The estimation schemes can be broadly classified into twocategories—discrete and continuous. Discrete schemes extract curvaturedirectly from the geometry of the mesh by estimating the local deviationof the mesh from a flat surface. Continuous schemes first approximatethe mesh vertices locally with a polynomial surface, allowingcalculation of various forms of curvature from the resulting analyticsurface by methods of differential geometry.

The original scanned data, and the modeled and calculated data have beenlinked to existing record sets containing the traditional descriptivedata about location, provenance, and collection of the vessels. The XMLschema using a metadata schema derived from the Council for thePreservation of Anthropological Records (COPAR). Was used to catalog andorganize the 2D and 3D vessel data. The schema defines data elements fora given artifact, and links data across multiple databases was developedand implemented to support research queries. The 2D data from existingdatabases can be incorporated by using the schema to translate and linkthe search processes with databases housing the 3D data. The projectprovides access to data from the existing ceramic vessel databases andadditional spatial and volume data acquired for scanned vessels.

To provide efficient access, a scanned vessel database was developed.The database is structured to house the original binary data files,modeled files, derived measurement, features, and other descriptivedata. A master pot identification number is used as the key to linkthese data elements with additional vessel data in existing databases.This structure permits additional databases to link to the query engineby adding a field with the master pot ID to each record set, anddeveloping an XML schema and DTD to link related data fields between thedatabases. It is scalable as a proof of concept, is consistent withDublin Core cataloging structures, and requires minimal coding toprovide access to additional databases.

A visual query interface as described above was used to permit users tointeract with the data using sketches or by selecting sample vesselshapes to augment text and metric search criteria to retrieve originaland modeled data, and interactive 2D and 3D models.

This project successfully implemented a powerful system of 3D modelingand analysis techniques to provide descriptive data and support researchinto vessel shape and structure. It is interesting to note that evenmeasurements that are relatively coarse from a computer science modelingperspective offer significant improvements in accuracy for ceramicresearchers. The ability to search and compare these accurate models ofvessels offers new tools to ceramic vessel researchers.

Example 2 Lithic Tool Manufacture and Refitting

The goal of the lithic refitting pilot project is the reconstruction ofstone tool manufacturing activities within prehistoric archaeologicalsites and related cultural behaviors by identifying sequences ofconjoining lithic artifacts. Refitting lithic artifacts by manual trialand error is an accurate, but highly labor-intensive, method thatrequires the entire sample of artifacts to be present in a single lab.These conditions are not always possible given various antiquitiesrestrictions. Automated 3D surface matching of conjoinable artifactswill dramatically enhance the efficiency of this valuable analyticmethod and extend the scope of searches to collections from differentsites. Lithic refitting, or assembling the pieces of stone, broken apartby prehistoric people, has proven very useful in our attempt to betterunderstand the prehistoric past. The technique is especially useful intechnological studies, taphonomic research, and spatial analysis. The3DK project is developing software that would reduce the time cost ofrefitting and facilitate development of an electronic database storing3D images of lithic collections available for study by researchers allover the world. FIG. 46 describes the XML schema used for lithics.

Example 3 3D Topography of Joint Surfaces

The objective of the “3D topography of joint surfaces pilot project” isto further the understanding of the abilities, limitations, andadaptations of our early human ancestors to make tools and walk uprightby developing biomechanical models of manipulative and locomotorbehavior using 3D osteological data. Use of calipers and visualinspection are inadequate to capture the complex curvatures of 3D jointsurfaces and to control for body size differences in cross-speciescomparisons. This can be overcome by developing scalable, quantitativemodels of reciprocal wrist, hand, and knee joint surfaces that willallow for comparative quantative analysis of the effect of surface area,curvature, and congruency on joint mechanics in extant and fossil apesand humans.

Since the project inception, more than 600 bones representing the wristand hand joints of humans, chimpanzees, gorillas, orangutans, andgibbons have been digitized to create a database that will eventuallyinclude approximately 1000 bones representing the wrist, hand, and kneejoints of humans and apes. With an aim to better understand thefunctional morphology of joint surfaces, the segmentation of featuresfrom a bone that are of particular interest to a physical anthropologistsuch as joint surfaces may be automated, avoiding manual digitization ofsuch features that is both time consuming and labor intensive. Thesurface areas and curvatures of joint surfaces and the congruenciesbetween reciprocal joint surfaces are then quantified and analyzed.Static and dynamic models can be built using digitized osteologicaldata, together with musculoskeletal data from cadavers, and manipulativeand positional behavioral data, to analyze the mechanics of manipulativeand locomotor behavior. FIG. 47 shows the schema used for a bone.

Feature Extraction from Volume Data Volume Segmentation Using WeibullE-SD Fields

According to another advantageous aspect of the invention, regions ofvolume data can be extracted for exploring the inner structure of thevolume data by performing segmentation of volume data. Many tasks involume visualization involve exploring the inner structures of volumedata. For example, a cell biologist may be interested in the structureof the microtubule spindle apparatus in an egg. The rapid increase indata set sizes required for collecting images around the spindleapparatus, as well as the poor signal to noise ratio in the data setmake it difficult to extract geometric features efficiently.

Segmentation of volume data is a process of voxel classification thatextracts regions by assigning the individual voxels to classes in such away that these segmented regions posses the following properties: (1)voxels within the same region are homogeneous with respect to somecharacteristic (e.g., gray value or texture); and (2) voxels ofneighboring regions are significantly different with respect to the samecharacteristic.

The input data is on a 3D structured grid of vertices v(i,j,k), eachassociated with a scalar value. A voxel is considered as a κ×κ×κ cube,and each voxel is assigned two values: expectancy and standard deviation(E-SD). The Weibull noise index is used to estimate the noise in avoxel, and to obtain more precise E-SD values for each voxel. Thesegmentation method has been tested using synthetic data as well as realvolume data from a confocal laser scanning microscope (CLSM). Analysisof this data shows distinct and defining regions in their E-SD plot.Under the guide of an E-SD plot, an object embedded in real andsimulated 3D data can be efficiently segmented.

According to the volume segmentation method, the input data is on a 3Dstructured grid of vertices v(i,j,k), each associated with a scalarvalue, and a voxel is considered as a cube including κ×κ×κ 3D structuredpoints, called a K-voxel. Each K-voxel is assigned two values:expectancy and standard deviation (E-SD). The expectancy in a voxelrelates to its mean, and the standard deviation indicates thevariability of the data within it. It is assumed that the E-SD values ofvoxels in a region are relatively homogeneous and different from that inother regions. Many voxels have the same E-SD value. If one plots thefrequency of voxels tht have the same E-SD, then some areas in E-SDdomain will be dense and some sparse. This plot is called the E-SD fieldof the volume data. As will be apparent to those of skill in the art,for a given volume data, the E-SD field depends on the size of κ-voxelsselected, i.e., the value of κ.

A simple and efficient way to calculate the E-SD is to compute itsaverage and the sample standard deviation. However, noise makes itdifficult to calculate the E-SD values accurately. Under this situation,the result of the E-SD plot is not stable and is dependent on astatistical model of the data. A number of statistical frameworks havebeen proposed to model image and volumetric data. The observed imagepixels have been modeled as Rayleigh distribution random variables withmeans depending on their position. A Gaussian-function was used forpixel relaxation labeling . Others instead proposed an exponentialfamily of functions including Gaussian, Gamma, Rayleigh, and Possion toperform segmentation on 2D.

According to the present method, a Weibull probabilistic framework forsegmentation of Confocal Laser Scanning Microscope (CLSM) volume data isdescribed. The Weibull distribution, first introduced in 1939 by Swedishengineer W. Weibull, builds on empirical grounds in statistical theoryof the strength of materials. The Weibull distribution includes threeparameters, which are described below. An advantage of the Weibulldistribution is that its kernel shape can be controlled by selectingdifferent parameters for the gray levels of the input volume data.

The following description discusses spatially distributed objects, theWeibull distribution and the Weibull noise index, as well as how to usethe Weibull distribution the along with our algorithms. 3D results fromcontrol data and two real CLSM volume data sets are presented.

Spatially Distributed Objects.

Consider volumetric data V, where the intensity of each image pointp=(i,j,k)εV is given by ν(i, j, k) or ν(p), whose size isN=N_(x)×N_(y)×N_(z). As used herein, distribution means the distributionof ν over a certain interval [a,b](a>0). The random variable X_(Ω)(v) isthe number of points in a region Ω⊂V which have the value v, written asX_(Ω). The density or frequency ƒ_(Ω)(ν) of a random variable X_(Ω) isdefined as follows: $\begin{matrix}{{f_{\Omega}\left( v_{0} \right)} = \frac{{\Omega_{v_{0}}}}{\Omega }} & (1)\end{matrix}$where Ω_(v) ₀ {(i, j, k)εΩ|v(i, j, k)=v₀}, and |Ω| denotes the number ofelements in Ω.

Assume that a homogeneous segment can be mathematically specified usingtwo criteria: (1) the relative constant of regional expectancy and (2)regional variance of the intensity. These criteria are expressed asfollows:

Definition 1.

A region K is called as a spatially distributed object (SDO), if theexpectancy and standard deviation for each K-voxel A in K are relativelyconstant, i.e.E[X _(Δ)]ε(e₁,e₂), and SD[X _(Δ)]ε(d ₁ ,d ₂),  (2)

where e₁, e₂, d₁, and d₂ denote predefined constants, the randomvariable X_(Δ) is defined as X_(Ω) above.

In general, the expressions of expectancy and standard deviation in aκ-voxel are given as follows: $\begin{matrix}{{{E\left\lbrack X_{\Delta} \right\rbrack} = {\frac{1}{\Delta }{\sum\limits_{{({x,y,z})} \in \Delta}{v\left( {x,y,z} \right)}}}},{{{and}\quad{{SD}\left\lbrack X_{\Delta} \right\rbrack}} = \sqrt{{\frac{1}{\Delta }{\sum\limits_{{({x,y,z})} \in \Delta}{v^{2}\left( {x,y,z} \right)}}} - {E^{2}\left\lbrack X_{\Delta} \right\rbrack}}},} & (3)\end{matrix}$where |Δ| denotes the number of elements in Δ. The SDOs are also called“agents” and regions of interest (ROI's). The goal of segmentation is tolocate SDO's. The choice of e₁, e₂, d₁, and d₂ depends on the E-SDfield.

However, if noise is present then equation (3) will not give accurateE-SD values. CLSM data which inherently includes noise and has a poorsignal to noise ratio resulting in these inaccurate the E-SD values.

Confocal Laser Scanning Microscopy (CLSM) is a technique for obtaininghigh resolution scans of optical slices through a thick specimen withouthaving to cut the specimen mechanically. Due to the precise lenses, thehigh coherency of the illuminating laser beam, and the confocal way ofgathering backscattered light, accurate focusing at specific planarlocations can be achieved. A typical optical section is between 0.1˜100μm. Scanning through the whole specimen thereby gives a full 3Dprojection view of the specimen. This technique is very useful not onlybecause it allows the volumetric analysis of biological data, but alsobecause the techniques used in “staining” these specimens (i.e. laserexcited dyes) increase the accuracy of these images as compared toimages obtained from ordinary optical microscopes. Nevertheless, imagesare still noisy and blurred. Several sources of noise can be identified.These include (i) thermal noise induced by the photomultiplier, (ii)photon-shot-noise, (iii) biological background (autofluorescence), and(iv) unspecific staining. The quality of the image can be affected by apossible mismatch of refractive indices, tissue scattering, dyeconcentration inside the cell, and the histological methods used forstaining. These factors contribute to a position-dependent noise andblurring, which makes the analysis of these images rather difficult.

Statistical theory has been used for segmenting medical and biologicaldata. This assumes that the data follows a distribution. Intensityvalues in a region have been assumed to follow a Gaussian distribution.The Gaussian, Rayleigh, and Poisson distributions have been discussedseparately. In our paper, before attempting to segment volume data usingstatistical theory, we first analyze the distribution. FIG. 47B showsthe distribution of a CLSM data (see FIG. 47A. The plot 600 shows thedistribution of the complete volume data (see FIG. 47A, which looks likethe Poisson distribution. The plot 602 shows the distribution at thebrightest region in FIG. 47A. This looks like a Gaussian distribution.The plot 604 illustrates the distribution in a 4-voxel.

Weibull Distribution

Weibull distribution is defined as follows: $\begin{matrix}{{{p(v)} = {\frac{a}{b}\left( \frac{v - v_{0}}{b} \right)^{a - 1}{\exp\left\lbrack {- \left( \frac{v - v_{0}}{b} \right)^{a}} \right\rbrack}}},} & (4)\end{matrix}$where v≧v₀, a>0 is the shape parameter, b>0 is the scale parameter, andv₀ is the shift parameter (the minimum possible value of the randomvariable). In the CLSM data, the minimum possible density value is zero,i.e. v₀=0. Therefore, it is assumed that the shift parameter of theWeibull distribution v₀=0. FIG. 48 gives the Weibull distribution (4)with different shape parameters a and scale parameter b=1.2 and v₀=0.The expectancy and the deviation of the random variable Xobeying theWeibull distribution are given by: $\begin{matrix}{{{E\lbrack X\rbrack} = {{b\quad{\Gamma\left( {1 + \frac{1}{a}} \right)}} + v_{0}}},{{{and}\quad{{SD}^{2}\lbrack X\rbrack}} = {b^{2}\left\lbrack {{\Gamma\left( {1 + \frac{2}{a}} \right)} - {\Gamma^{2}\left( {1 + \frac{1}{a}} \right)}} \right\rbrack}},} & (5)\end{matrix}$where the gamma function is Γ(α) = ∫₀^(∞)t^(α − 1)𝕖^(−t)  𝕕t.It can be shown that when a=1.0, it is the Possion pdf; and when a=2.0,one has the Rayleigh pdf, and when a=3.0, it turns into the Gaussianpdf. When a>>1, the distribution tends to be a uniform pdf. Therefore,the Weibull model is a suitable model to fit the histogram distributionof these volume data and the regions within them whose statisticalproperties are unknown since the kernel shape can be controlled byselecting a different a value.

From equation (5), it is clear that the parameters a and b of theWeibull distribution depend on the expectancy and the standarddeviation. We denote the ratio r=SD/E in equation (5) with v₀=0, and therelationship between r and the shape parameter a of its Weibulldistribution is as follows: $\begin{matrix}{{{r^{2} = {\frac{{SD}^{2}\lbrack X\rbrack}{E^{2}\lbrack X\rbrack} = {\frac{\Gamma\left( {1 + {2/a}} \right)}{\Gamma^{2}\left( {1 + {1/a}} \right)} - 1}}},{or}}{{\frac{1}{r^{2} + 1} = {\frac{\Gamma^{2}\left( {1/a} \right)}{2a\quad{\Gamma\left( {2/a} \right)}} = {{\frac{1}{2a}{B\left( {{1/a},{1/a}} \right)}} = {\frac{1}{2}{{tB}\left( {t,t} \right)}}}}},}} & (6)\end{matrix}$where the Beta function is B(x, y) = ∫₀¹t^(x − 1)(1 − t)^(y − 1)  𝕕t,a is the shape parameter of the WD and t=1/a. From equation (6), it canbe seen that the shape parameter of the Weibull distribution is onlydependent on ratio r in the E-SD plot. When t≈0.42, the RHS of equation(6) reaches its maximum, which is near the value 0.72. Unfortunately,the RHS of equation (6) is not a monotonic function of t (see FIG. 49).For each r, there are two roots. In order to overcome this difficulty,we first give some properties of the Weibull distribution:

Property 1

For every s>0, the s-moment of Weibull distribution is:EX ^(s) =b ^(s)Γ(1+s/a).

Property 2:

If X₁,X₂, . . . , X_(n) are independent distribution random variables,and follow the Weibull law, then${{\frac{1}{n}{\sum\limits_{1}^{n}{\left. X_{i}^{s}\longrightarrow{EX}^{s} \right.\quad{for}\quad 1}}} \leq s < \infty},\left. {{as}\quad n}\rightarrow{\infty.} \right.$

Weibull Noise Index

Removing noise or improving the signal-to-noise ratio (SNR) of a givenimage is an essential step in segmentation, especially in high noisesituations that can disrupt the shape and lose the edge information thatdefines the structure of objects. The traditional algorithms ofdenoising, such as Gaussian filter, reduce the noise but they do notmaintain the edge information. When noise is removed, it is desirable,not only to smooth all of the homogenous regions that contain noise, butalso to keep the position of boundaries, i.e., not to lose the edgeinformation that defines the structure of objects.

Let ν₁, ν₂, . . . , ν_(κ) ³ represent κ³ image points in a givenK-voxel. It is assumed that the value of a voxel is characterized by theWeibull distribution. If equation (3) is used to calculate the E-SDvalue, the results are not reliable due to noise, especially for astandard deviation. Therefore, one must find a way to distinguishwhether or not the data distribution in a K-voexl is uniform. If it isnot uniform, then what kind of noise is present? If few of the elementsin a voxel are significantly larger and/or smaller than others, thenthese are called upper/lower noise. For example, in a 2-voxel, in whichthe set of the intensity at eight image points is {234, 52, 64, 46, 50,54, 62, 3}, the element 234 is much larger than others, and is called anupper noise. The element 3 is significantly less than others, and iscalled a lower noise. In order to classify the noise in a κ-voxel, anauxiliary function g(s) is introduced: $\begin{matrix}{{{g(s)} = {\frac{1}{n}\frac{\left( {\sum\limits_{1}^{n}v_{i}^{s}} \right)^{2}}{\sum\limits_{1}^{n}v_{i}^{2s}}}},} & (7)\end{matrix}$where sε(−∞, ∞), v_(t)>0, 1≦i≦n, and n=κ³. By calculating equation (7)directly, we have the derivative of the function g(s) that satisfies:g′(s)≦0 for s>0, g′(s)≧0 for s<0, and g′(s)=0 if and only if v₁=v₂=. . .=v_(n). Also:${{g(0)} = 1},{{g(\infty)} = \frac{k_{1}}{n}},\quad{{{and}\quad{g\left( {- \infty} \right)}} = \frac{k_{2}}{n}},$where k₁ is the number of the elements which are equal to the maximum,and k₂ is the number of the elements which are equal to the minimum. Itis obvious that 1≦k₁,k₂≦n, and k₁+k₂≦n.

Using Property 2, we know that${{{g(s)} \approx \frac{\left( {EX}^{\quad s} \right)^{2}}{{EX}^{\quad{2s}}}} = \frac{t_{s}{B\left( {t_{s},t_{s}} \right)}}{2}},$where t_(s)=s/a, and a is the Weibull distribution shape parameter inthe κ-voxel. From the analysis above, the function t_(s)B(t_(s),t_(s))/2reaches its unique positive maximum near 0.72 at t≈0.42. If k₁ and k₂are small enough (k₁,k₂≦└0.14n┘^(†)) and there is a so such thatg(s₀)=0.72, then we have^(†)└x† is the maximum integer which is less than x.a≈s₀/0.42  (8)

Definition 2:

If s₁>0 such that g(s₁)=0.72 (if k₁>0.72 n, set s₁=∞), then s₁>0 iscalled the Weibull upper noise index. If s₂>0 such that g(−s₂)=0.72 (ifk₂>0.72 n, set s₂=−∞), then s₂>0 is called the Weibull lower noiseindex. In short, they are called the Weibull noise indices.

These two parameters are used to determine the “goodness” of voxeldistribution as follows (see FIG. 50):.

-   -   For a κ-voxel, if the Weibull upper noise index s₁<1.26 and the        lower noise index s₂>1.26, then there is upper noise in it.    -   For a κ-voxel, if the Weibull upper noise index s₁>1.26 and the        lower noise index s₂<1.26, then there is lower noise in it.    -   For a κ-voxel, if the Weibull upper noise index s₁<1.26 and the        lower noise index s₂<1.26, then there is upper and lower noise        in it.        Segmentation Algorithm        Based on the analysis above, the algorithm for volume data        segmentation is as follows:    -   Step 1:

Given a κ to determine the size of κ-voxel, initialize the SDO'spredefined constant in equation (2): e₂>e₁>0,d₂>d₁>0, and the thresholdof expectancy T_(c)>0.

-   -   Step 2:

Consider the jth κ-voxel. Use bisection to compute its Weibull noiseindex s₁, and s₂ which are the roots of the equation g(s)=0.72, whereg(s) is defined by equation (7). If there is upper noise or lower noiseor both, then remove the noise directly (i.e. delete the minimum or themaximum or both). Repeat at most └Cκ³┘times to execute Step 2;

-   -   Step 3

Calculate E-SD values using equation (3). If the expectancy is largerthan the threshold T_(c), add the κ-voxel to list A. If there are someκ-voxels which have not been dealt with, then go to Step 2.

-   -   Step 4:

Compute the frequency of the voxel in the list A, and create the E-SDplot. By using initial E-SD values e₂>e₁>0, d₂>d₁≧0, select the voxelwhich satisfies: e₂≧E≧e₁, d₂≧D≧d₁.

In this algorithm, the threshold T_(e) is used for controlling the sizeof list A above, and will cause the image to be rendered faster. Theconstant C in Step 2 is equal to 0.14. E-SD values (e₂, e₁; d₂, d₁) areobtained by moving a rectangle, called a window in the E-SD field,through user interaction.

The algorithm described above is simple and efficient. Its averagecomplexity is O(L log L), where L is the number of boxes, defined byL=N/(κ³) where N is the number of points in the volume data. As will beapparent, if the grains (or κ-voxels) are coarser (i.e. κ is larger),then the algorithm is more efficient; however, the results of thesegmentation will be coarser also. With different κ(κ=κ_(min), . . . ,κ_(max), κ_(min) and κ_(max) are prespecified), the selected κ best fitsthe given volume data. By this fitting approach, the number of SDOs involume data is determined. The minimum K, which shows the number of SDOsin E-SD field, will be chosen.

EXAMPLES

In this section we will look at two examples illustrating the proposedmethod for segmentation. The first example examines artificial volumedata generated using a Weibull distributed random number with differentparameters. The second example uses real CLSM data from two differentdata sets, and demonstrates how this model can be used to segment suchdata. The hardware we used is a Dell Precision workstation 330, with P41.4 GHz CPU, and 1-GB RAM.

Controlled Experiment

In order to make the experiment comparable, a controlled experiment isdone in the following way: we first segment a noise-free volume data(see FIG. 52A) and treat that segmentation as our reference, whichincludes a torus, an ellipsoid, and two deformed cubes of differentsizes but with the same shape parameter a, in a 100×100×100 cube (seeFIG. 52B). Every instance of a volume data with added noise is thendenoised using a Gaussian filter with σ=1.3, and a Weibull noise indexwith 2-voxels or 3-voxels. The targets are then segmented from theseimages and compared to the reference objects. The comparison, based onthe segmented volume, is done by identifiing the support function of thereference object and of the object segmented from a volume data withadded noise, denoted by S_(r) and S_(n), respectively. That is S₈₆(x)=1or 0, if x is in the segmented objects or not, where ξ={r,n}. Then, thevolume deviation (error) of S_(n) from S_(r) in the volume data V isdefined as follows:${\delta\left( {S_{n},S_{r}} \right)} = {\frac{\sum\limits_{x \in V}{{{S_{r}(x)} - {S_{n}(x)}}}}{\sum\limits_{x \in V}{S_{r}(x)}}.}$

These deviations are calculated to produce numbers that are comparableacross different noise levels. Several levels of noise have been addedto the test volume data to show the robustness of the filter. The noisethat is added to every image point is a Weibull distribution withdifferent scale parameters b: Y=min(255,C[−bln(1−X)]^(1/a)), whererandom variable X is the uniform distribution, a is the Weibull shapeparameter and C is a constant for each object. Finally, the volume errorfrom the simulated volume data is plotted in FIG. 51. As depicted inFIG. 51, the volume error corresponding to the Weibull noise index issignificantly lower compared to those that result from applying aGaussian filter. Segmentation results are shown in FIG. 52.

FIG. 6(i) shows the Weibull E-SD field of the noise-added volume datawith the scale parameter b=10.0. Below we define where the colors in anE-SD field correspond to frequency. We denote by N(e,d) the number ofthe κ-voxels whose expectancy is e and standard deviation is d. Thecolor at point (e, d) in the E-SD field is determined by log(N(e, d)).We set the color at point (e, d) as follows:${{color}\left( {e,d} \right)} = \left\{ \begin{matrix}{{{RGB}\left( {255,255,255} \right)},} & {0 \leq {\log\quad\left( {N\left( {e,d} \right)} \right)} < 0.5} \\{{{RGB}\left( {255,0,0} \right)},} & {0.5 \leq {\log\quad\left( {N\left( {e,d} \right)} \right)} < 1.0} \\{{{RGB}\left( {0,255,0} \right)},} & {1.0 \leq {\log\quad\left( {N\left( {e,d} \right)} \right)} < 1.5} \\{{{RGB}\left( {0,0,255} \right)},} & {1.5 \leq {\log\quad\left( {N\left( {e,d} \right)} \right)} < 2.0} \\{{{RGB}\left( {255,0,255} \right)},} & {2.0 \leq {\log\quad\left( {N\left( {e,d} \right)} \right)} < 2.5} \\{{{RGB}\left( {0,255,255} \right)},} & {2.5 \leq {\log\quad\left( {N\left( {e,d} \right)} \right)} < 3.0} \\{{{RGB}\left( {255,255,0} \right)},} & {3.0 \leq {\log\quad\left( {N\left( {e,d} \right)} \right)}}\end{matrix} \right.$

The colors in the E-SD field below have the same meanings. Next, therectangle, called a window (see FIG. 6(i)), has two small circles at itsleft-top and right-bottom vertex, which give the values (e₂,e₁;d₂,d₁) todefine the range of expectancy and standard deviation of a SDO.

FIG. 6(c) shows a slice of the noise volume data with the Weibull scaleparameter b=5.0. The segmentation results, using our method andthreshold method with Gaussian filter, are given in FIG. 6(d) and 6(e).Although it has lost some detail information, such as the deformation inthe two cubes compared with the reference FIG. 6(b), the segmentationusing our method keeps the number¹ and shape of objects. In contrast,segmentation performed by using threshold methods with Gaussian filterlost the number and shape of the objects.¹ Different components are colored using different colors.

FIG. 6(f) shows a slice of the noise volume data with the Weibull scaleparameter b=10.0, and the segmentation results from this noise datausing our method and the threshold method with a Gaussian filter aregiven in FIG. 6(g) and 6(h). FIG. 6(i) shows the Weibull E-SD field ofthis noise volume data, and illustrates that the three differentcomponents present. Using our segmentation method maintains the numberand shape of the objects, but using threshold techniques with Gaussianfilter fails in segmenting the objects.

CLSM Data

In this example we use data from two different CLSM data sets designedfor investigating the meiotic spindle within a mouse egg. The eggs wereviewed on the Leica TCS NT confocal microscope. Multiple lasers allowfor simultaneous imaging of the DAPI ( Argon TV [363 nm]) and ALEXA 568Krypton[568 nm]) fluorophore-labeled samples. Using a 63× waterobjective, images were scanned between 0.2-0.4 μm intervals along thez-axis, 0.154 μM along the x/y-axis and collected through the volume ofthe entire egg.

FIG. 7 shows a CLSM test bed for our Weibull E-SD modeling scheme fromdifferent experiments. The data in FIG. 7(a) is collected using aKrypton laser and highlights regions targeted with an antibody toanti-α-tubulin at the upper left and brighter regions through the egg.The image in FIG. 7(a) shows a meiotic metaphase II arrested egg, and iscomposed of 124 slices within a 512×512 matrix, contains a gray-levelfrom 0 to 255. The size of the data shown in FIG. 7(b) is 512×512×146,and has the same gray-level range as in FIG. 7(a). In FIG. 7(b) the egghas a meiotic spindle at the upper left and the remains of the primarypolar body at the bottom right.

FIG. 8 shows the E-SD plots of FIG. 9. The colors have the same meaningsas in FIG. 6(i). In FIG. 8(a), the size of the window is (170, 237; 4,27) corresponding to the segmentation of the data shown in FIG. 9(a). InFIG. 8(b), the size of the window is (183, 255; 2, 30) corresponding tothe segmentation of the data shown in FIG. 9(b). We set the thresholdat______=34, and κ=2. The segmentation in FIG. 9(a) includes 3902voxels, and 11308 voxels in FIG. 9(b).

FIG. 7. Test bed for our Weibull E-SD modeling scheme. The images comefrom two different data sets. In both images there is only one cell andits spindle is at the up part of the cell (labeled by biologist). Thesize of image (a) is______, its gray-level is from 0 to 255. The size ofimage (b) is 512×512×146, it has the same gray-level as (a).

FIG. 8. The Weibull E-SD fields from the data shown in FIG. 7. Thecolors have the same mean as in FIG. 6(i). In 8(a), the size of windowis (178, 237; 4, 27) corresponding to the segmentation shown in FIG.9(a). The threshold is______34, which creates a blank on the left side.There are two clear regions. In 8(b), the size of window is (167,255; 2,30) corresponding to the segmentation shown in FIG. 9(b). The thresholdis______34, which creates a blank on the left side. There are two clearregions too.

FIG. 9. The spindle segmentation corresponding to FIG. 7. Thesegmentation in 9(a) includes 3902 boxes, and 11308 boxes in 9(b).

What gives rise to such clear regions in Weibull E-SD field shown inFIG. 8 is unknown. A plausible explanation is that each regioncorresponds to a SDO. When we move the window on FIG. 8(a) to the smallregion, an area of cell perimeter in FIG. 7(a) is segmented (see FIG.10).

Conclusion

We have proposed a coarse-grain approach for the segmentation of anobject from volume data based on the Weibull E-SD field. We have shownthat one can decide the noise index in a K-voxel by using the Weibulllaw, and use the E-SD field as a guide to segment an object. We haveconsistently demonstrated this approach on controlled as well as on realvolume data.

Statistical 3D Segmentation with Greedy Connected Component LabelingRefinement

A new approach for segmenting 3D voxel data sets will now be described.It is semi-automatic and consists of two phases. First, the initialsegmentation is accomplished by voxel labeling using statistical maximumlikelihood estimation techniques. The novelty of the present approach isthat the type probability distribution function is not required apriori. A multi-parameter distribution that includes a variety of widelyapplicable distributions is utilized. The second phase refines thesegmentation by the use of a new greedy connected component labeling(GCCL). The overall effectiveness of this new approach is illustratedwith data examples of multichan-nel laser scanning confocal microscope(LSCM) images where the structure of GFAP (a protein) and nuclei, andthe geometry of an electrode implant are extracted.

Multichannel laser scanning confocal microscopy (LSCM) equipped with aspecific laser/filter combination to collect multiple fluorescencesignals in a single scan is applied increasingly in experimentalbiological investigation. In this application, the confocal 3D imagesare collected from two channels. The stack of 2D images (each 2D imagereferred to as a “slice”) make up the volumetric data. A typical sectionis from 10˜φmm. It is composed of different shaped structures, forexample, tree-shaped GFAP (a protein expressed by astrocyte cells), andblob-shaped nuclei (labeled with a DAPI stain) and sheet-shapedelectrode implants. The segmentation of these regions can yieldimportant biological information. For example, segmentation andsubsegment determination of GFAP volume allows for quantitative analysisof the immune response to the implanted electrode. In LSCM inparticular, measurement of the relative positions of regions labeledwith different cells/implants can provide an insight into theirinter-functional relationship. Due to image noise and different shapesof the cells/implants to be segmented, considerable effort is requiredto develop accurate image segmentation for localizing the structures.

Recently, several probabilistic frameworks to determine algorithms forthe segmentation in an image have been given. It is now well known thatto obtain efficient algorithms, the statistical properties of voxelsshould be taken into account. Members of the exponential family includesGaussian, Gamma, Rayleigh, Poisson, and other familiar distributionsthat have been used to model real data. Indeed, there are several kindsof 2D images for which the pixel values are correctly described by suchstatistical law. Gaussian distribution is widely used to characterizethe probabilistic feature of random variables of an image. omeultrasonic medical images have been modeled using Rayleigh's law.Astronomical images have been presented using Poisson distribution andcompared to the classical linear intercorrelation technique. Previously,it has been assumed that the probability density function (pdƒ) of therandom fields, which describe the gray levels in the input image,belongs to the exponential family. However, a pdf with a singledistribution shape will limit the use of the segmentation approach, andasks users to distinguish the different pdfs of input images. Inaccordance with the present, we use the Weibull pdf, whose kernel shapecan be controlled by selecting different parameters.

When a statistical model-based method is used, both voxel and itsneighbors should be considered, to estimate the parameters of the pdffor each voxel. Then, a histogram, called the Weibull a-b histogram, isgenerated. Using voxel labeling guided by the Weibull a-b histogram, theinitial segmentation result can be obtained. However, due to theunavoidable noise in LSCM images, the initial segmentation is quitecoarse, e.g. this results in many isolated small regions. In order toovercome the problem, we introduce a new algorithm of connectedcomponent labeling (CCL), called Greedy Connected Component Labeling(GCCL), to delete the unwanted small regions. CCL algorithms segment thedomain of a binary image into partition that corresponds to connectedcomponents. CCL is one of the fundamental segmentation techniques usedin image processing. The classical CCL algorithms pay more attention tothe cost of memory but not time complexity. The most common approach toavoid making several passes over the image is to process the image in asequential order and keep an equivalence table that stores theinformation about the equivalence of labels as-signed to the voxels fromthe same component. Others have used two passes in which the image isscanned in raster order, and the size of equivalence table is bounded toO(7N) for a N×N 2D raster image. Due to the increase in the size of a 3Dimage, GCCL takes into account the computation time cost as well as thesmaller size of the equivalence table. In the GCCL algorithm accordingto the present invention, one pass for a 3D image is used, the size oflocal equivalence table is O(the width of connected component), and thetime cost is O(N) with low constant, where N is the number of voxels inthe image.

The algorithm according to the present invention has been tested usingLSCM volume data shown in FIG. 60. FIG. GR1(a) shows a LSCM image ofGFAP-labeled astrocytes and DAPI stained nuclei, in which the greenindicates regions targeted with GFAP-labeled astrocytes, and the redregions identify DAPI, which consistently targets nuclei cells. FIG.1(b) shows a GFAP and implant LSCM image, in which the red targetselectrode implant devices and the green GFAP. FIG. 60A and 60B arerendered using maximum intensity projection (MIP).

Statistical Image Model

We will now introduce Weibull statistical modeling and describe how tocreate a Weibull a-b histogram and how to obtain an initialsegmentation.

Weibull Modeling

Consider a 3D image I organized as a stack of 2D transverse slices, andthe grid point set is denoted by I, whose size is N_(x)×N_(y)×N₂ wherethe intensity of each voxel (i, j, k)εI is given by v(i; j; k). TheWeibull probability density function (pdƒ) of gray level of each voxelis given by $\begin{matrix}{{{p(v)} = {\frac{a}{b}\left( \frac{v - v_{0}}{b} \right)^{a - 1}{\exp\quad\left\lbrack {- \left( \frac{v - v_{0}}{b} \right)^{a}} \right\rbrack}}},} & (1)\end{matrix}$where v≧v₀; a≧0, b≧0; v₀≧0, and v is the gray level of the voxel, a is ashape parameter, b is a scale parameter, and v₀ is a shift parameter(the minimum possible value of the random variable). This triparametricdistribution was introduced in 1939 by Swedish engineer, W Weibull, onempirical grounds in the statistical theory of the strength ofmaterials. It can be shown that when a=1.0, it is the Possion pdƒ, whena=2.0, one has the Rayleigh pdf; and when a=3.0, it turns to theGaussian pdf. When a>>1, the distribution tends to a uniform pdf.Therefore, the Weibull model is a suitable model to fit the histogramdistribution of these images and the regions in them whose statisticalproperties are unknown since the kernel shape can be controlled byselecting a different a value. FIG. 61 shows the Weibull distributionequation (1) with different shape parameters a, the scale parameterb=1.2 and the shift v₀=0. For LSCM imaging, the minimum possibleintensity value is zero, i.e. v₀=0. Therefore, in the discussion thatfollws, we assume that the shift parameter of the Weibull distributionsatisfies v₀=0.

Assume that the observed image I is composed of two zones: the targethaving one (or several) simple connected region(s) and the background.Under this assumption, the target in the image is completely defined bya binary image T={T(i, j, k)|(i, j, k)εI} that defines a certain shapefor the target so that T (i, j, k) is equal to one within the target andto zero elsewhere. Thus, the target in the image is the region:Ω_(T)={(i, j, k)εI|T(i, j, k)=1}.

The purpose of segmentation is, therefore, to estimate the binary imageT for the target in the image. Without more a priori knowledge about thetarget, the maximum likelihood estimation is first introduced to computethe Weibull parameters.

Maximum Likelihood Estimation

The parameter estimates obtained using the maximum likelihood techniqueare unique, and as the size of the sample increases, the estimatesstatistically approach the true values of the sample. Let ν1; ν2; ¢¢¢;vn represent n voxels of image. It is assumed that the in-tensity ofeach voxel is characterized by the Weibull distribution. The likelihoodfunction associated with the sample is the joint den-sity of n randomvariables, and thus is a function of the unknown Weibull distributionparameters (a; b). The likelihood function for a sample under theseassumptions is given by the expression $\begin{matrix}{{L\left( {a,b} \right)} = {\prod\limits_{i = 1}^{n}{\left( \frac{a}{b} \right)\left( \frac{v_{i}}{b} \right)\exp\quad{\left( {- \frac{v_{i}}{b}} \right)^{a}.}}}} & (2)\end{matrix}$

The parameter estimates are determined by taking the partial derivativesof the logarithm of the likelihood function with respect to a and b andequating the resulting expressions to zero. The system of equationsobtained by differentiating the log likelihood function for a sample isgiven by $\begin{matrix}{{{{g(a)} \doteq {\frac{\sum\limits_{i = 1}^{n}{v_{i}^{a}{\ln\left( v_{i} \right)}}}{\sum\limits_{i = 1}^{n}v_{i}^{a}} - {\frac{1}{n}{\sum\limits_{i = 1}^{n}{\ln\left( v_{i} \right)}}} - \frac{1}{a}}} = 0},{b = \left\lbrack {\frac{1}{n}\left( {\sum\limits_{i = 1}^{n}v_{i}^{a}} \right)} \right\rbrack^{1/a}}} & (3)\end{matrix}$

In order to find the parameter a from equations (3), we first in-troduce a function g(a) defined by equation (3). For any sample v1;v2;¢¢¢ ;vn with vi>0(1·i·n), we can directly compute that the derivativethat satisfies g 0 (a)>0, and so g(a) is a monotonic as-cending functionwith shape parameter a, and lim a ! 0+g(a)=_(i)¥, and lima!□□g(a)=kln(max(vi))_(i) 1 n □□i=1 ln(vi)>0, where k, 1 is the number ofthe elements, which reach the maximum in the set fv1; v2; ¢¢¢ ;vn g.Therefore, for any sample v1;v2; ¢¢¢;vn with vi>0(1·i·n), g(a) has oneand only one positive root. FIG. 3 shows the plot of function g(a) fordifferent samples. Once a is determined, this value is inserted intoequation (3) and b is cal-culated directly. From equation (3), it iseasy to see that b is the a-moment of sample v1;v2; ¢¢¢;vn withvi>0(1·i·n), and a indicates the deviation of this sample. The lessdeviation, the larger the root of equation g(a)=0 (see FIG. 61). By theanalysis above, we can use the bisection algorithm to solve equations(3) and get the Weibull parameters.

Voxel Labeling by Weibull a-b Histogram

Once the Weibull model is obtained, the segmentation problem amounts toassigning labels to each voxel in the volume. A straight-forward way isto label voxels as the target or the background by maximizing theindividual likelihood function. This approach is called ML classifier,which is equivalent to a multiple thresholding method. Usually, thismethod may not achieve good performance since there is a lack of localneighborhood information to be used to make a good decision. Therefore,we incorporate the local neighborhood information at a voxel into alabeling procedure, thus improving the segmentation performance. Supposev i; j;k is the intensity of a voxel (i; j;k) 2I in the image I, w i;j;kis a size d£d£d window centered at (i; j;k) for maximum likelihoodestimation, where d is an odd integer greater than 1. We regard w i; j;kas the local region while we calculate the Weibull parameters for thevoxel (i; j;k) using equation (3).

When the intensity value ranges from 0 to 255, then the value of theWeibull scale parameter b is also from 0 to 255 by equation (3). Onother hand, the more uniform the region surrounding a voxel is, thelarger the Weibull shape parameter a becomes. Experimentally, we set theupper boundary of the Weibull shape parameter a to be 100. The size ofthe window has influence on the calculation of the Weibull parameters.The window should be large enough to allow enough local information tobe involved in the computation of the Weibull parameters for the voxel.Furthermore, using a larger window in the computation of the parametersincreases the smoothing effect. However, smoothing the local area mighthide some abrupt changes in the local region. Also, a large window mayrequire sig-nificant processing time. Weighing the pros and cons, wechoose a 3×3×3 or 5×5×5 window for estimating the Weibull parameters andhave experimental data to back that choice.

A classical histogram is a statistical graph counting the frequency ofoccurrence of each gray level in the image or in part of an image. Weextend this idea and define a histogram in the Weibull a-b domain.First, the Weibull shape parameter a and scale parameter b for eachvoxel are calculated. Second, for each Weibull parameter pair (a;b) 2[0;100]£[0;255], count the number of voxels with this parameter pair.Here two issues arise in computing the frequency for each parameterpair. One is that the step of the Weibull a-b domain is (1,1). The otheris that we set a low boundary for the scale parameter b, since b is thea-moment of the intensity sample surrounding a voxel. We should identifythe target with higher intensity, not the background with lowerintensity. Last, we have the frequency for each parameter pairlogaritluized, and plotted against that pair, and colored as the legendshown in FIG. 62. The Weibull a-b histogram gives us a globaldescription of the distribution of the uniform regions across intensitylevels. We use a movable rectangle in the histogram to locate the rangeof the colored peak zone by moving its top-left or bottom-right vertex,and select all the voxels with the Weibull a-b parameter histogram beingin this rectangle.

An advantage of segmentation using the Weibull a-b parameter histogramis that local information and global information are both taken intoaccount in determining the segmentation, whereas in traditionalhistogram approaches only global information is con-sidered.

Segmentation Refinement using GCCL

Although LSCM images have a much high accuracy, these images are stillnoisy and blurred. Several sources of noise can be identified: thermalnoise induced by the photomultiplier, photon-shot-noise, biologicalbackground (autofluorescence), and unspecific staining. The quality ofthe image depends also on possible mismatch of refractive indices, ontissue scattering, on dye concentration inside the cell, and thehistological methods used for staining methods. These factors contributeto a position-dependent noise and blurring. Therefore, segmentationresults using voxel labelling by the Weibull a-b histogram can be quitecoarse and may lead to the problem that there are many isolated smallsegmented components, as shown in FIG. 64A. This is due to two reasons.First, a thresholding based segmentation is a binary representation of aregion that either includes or excludes a given voxel as being in or outof the region. Second, the voxel labelling cannot distinguish thetargets from the noise. On the other hand, voxel labelling can not showthe shape of cells and implants segmented and the relationship amongthem, because voxel labelling only uses the in-formation around a voxel.

We wish to correct these problems by deleting the unwanted small regionsand finding structural and quantitative information in an image usingconnected component labelling (CCL), which seg-ment the domain of abinary image into partitions that correspond to connected components.Here, two voxels are 6-connected if at most one of their 3D coordinatesdiffers by 1, 18-connected if at most two coordinates differ by 1, and26-connected if all three coordinates are allowed to differ. A6/18/26-connected path through the 3D image is a sequence of6/18/26-connected voxels. A 6/18/26-connected component in a binaryimage is a subset in which for every two voxels there is a6/18/26-connected path between them. The partitioning is represented byan 3D image in which all voxels that lie in the same connected componenthave the same voxel value. Distinct voxel values are assigned todistinctly connected component. It is clear that only the target voxelsare affected by this labelling i.e. the background voxels remainunchanged.

Hierarchy Frame for a Connected Component

For a connected component C, a hierarchy frame H C (r) rooted from voxelr is defined as a partitioning of C into hierarchy f h1(r); h2(r); ¢¢¢;hn(r) g satisfying 1.h1(r)=frg; 2. All voxels adjacent to voxels inhierarchy hi (r); (i=2; ¢¢¢; n_(i)1) are in hierarchies hi_(i) 1(r), hi(r) and hi+1(r); 3. All voxels adjacent to voxels in hierarchy hn(r) arein hierar-chies hn_(i) 1(r) and hn(r). where n is the total number ofhierarchies, and is simply the depth of C. maxi fj hi (r) jg is calledthe width of C, where j ¢ j denotes the number of the elements in a set.FIG. 5 shows a 2D 8-connected component which is labeled by the set f A;B; C; D; E; F; G; H; I; J; K; L; M; N; O; P; Q g. If the root is A, thenthe hierarchy frame for this connected component is h1(A)=fA g,h2(A)=fB; Eg, h3(A)=fC; F; Q; Eg, h4(A)=fD; Eg, h 5(A)=fH; Jg, h6(A)=fK;L; M g, and h7(A)=fP; O; Ng. Therefore, the depth of C is 7, and thewidth is 4.

GCCL

For a binary image T, when GCCL finds a unlabeled voxel, called a rootr, it does not stop until all voxels in WT connected to r though a6/18/26-connected path are labeled. The procedure of labelling aconnected component C by GCCL is to find the hierarchy frame H C (r)rooted from r. In our implementation of GCCL, we regard a connectedcomponent as a dynamic list (DL), and different com-ponents are assignedto different DLs. However, the main problem with GCCL is that GCCL mayrepeat to label a voxel in C. In order to avoid the expensive operationof DL, such as adding a unique node to a DL, we use 4-valued flags todetermine the value of a voxel in T. Let t(i; j; k) denote the value atvoxel (i; j; k) 2 T, and set t(i; j; k) to be either 0 or 1 or 2 or 3.If t(i; j; k)=0, then the voxel (i; j; k) belongs to the background andis not changed. When t(i; j; k)=1, the voxel (i; j; k) belongs to thetarget, it is not la-beled, and can be a root to form a new component ora new node to be added to the DL. When t(i; j; k)=2, the voxel (i; j; k)can be a new node to be added to the DL, but can not be a root to form anew component. When t(i; j; k)=3, it means that the voxel (i; j; k) hasbeen added to a DL, neither as a new root nor as a new node. The GCCLalgorithm is as follows:

GCCL Algorithm: Read T, and set t(i, j, k) = 1, if (i, j, k) ε Ω_(T),while t(i, j, k) = = 1 loop /* the first loop*/ Generate a DL, denotedby dl, to store the connected component whose root is (i, j, k), and atemporary DL, denoted by tdl, to store middle results. Two counters,denoted by num1 and num2, represent the increase of dl and tdl,respectively, and are initialized to zero. Add voxel (i, j, k) into tdl;while tdl ≠ Ø, and voxel (l, m, n) ε tdl loop /* the second loop*/ ift(l, m, n) = = 1 or 2 then while all the 6/18/26-connected voxels of (l,m, n), denoted by (l ± 1, m ± 1, n ± 1), and t(l ± 1, m ± 1, n ± 1) = 1loop /*the third loop*/ num1++; Add voxel (l ± 1, m ± 1, n ± 1) intotdl; Set t(l ± 1,m~ 1, m ± 1, n ± 1) = 2; End loop Set t(l, m, n) = 3;Add (l, m, n) into dl; num2++; End if Remove voxel t(l, m, n) from thetdl; if num1 ≠ num2 then Reset the tdl; End loop End loop

Here, four issues arise in the GCCL algorithm. One is that for aconnected component C, its root is the voxel (i0; j0; k0), where k0=min(i; j;k) 2 C k, j0=min (i; j;kO) 2 C j, and i0=min (i; j0;k0) 2 C i. Thesecond is that we use three primitive operations of DL:

Add(x): It creates a new node for containing a voxel. The node is thenpushed on the DL of voxel x at O(1) time cost.

Remove(x): This operation first moves the pointer of DL to the root ofDL, and then finds the voxel x in the DL. If x is in DL, then delete, Ifnot, go through the DL. In general, if there are n nodes in the DL, thisoperation costs O(n). In this algorithm, the voxel to be removed isalways the root of the DL, therefore, this operation has O(1) cost.TABLE 4 The efficiency of the GCCL algorithm Data Data Size ConnectedComponent Time (s) Amount Largest GFAP & Nuclei 29.9 MB 2514 87536 4GFAP & Implant 63.9 MB 3115 234976 8 Connected Component Data Data SizeAmount Largest Time (s) GFAP & Nuclei 29.9 MB 2514 87536 4 GFAP &Implant 63.9 MB 3115 234976 8

Reset: It reinitializes the DL and returns the root of the DL in O(1)cost. The third issue is that the number of times the first loop isexcuted in the GCCL algorithm is the number of the connected componentsin the image. The number of times in the second loop is dependent on thenumber of voxels in a connected component, and the third is a constanteither 6, 18 or 26. Therefore, the time complexity of the GCCL algorithmis O(N) with small constant, where N is the number of the voxels in theimage. The last issue is that the size of local equivalence table isO(the width of connected component), because each local equivalencetable is the DL tdl in GCCL algo-rithm. In FIG. 64, the right image isthe voxel labelling result by means of Weibull a-b histogram, includes2514 26-connected components, and the left one is the results of theGCCL. Table 4 shows GCCL on different data. This was run on PIV 1.7GHzwith 4GB RAM Window 2000. As shown in FIGS. 65A and 65B, the resultingsurface from our segmentation can be very coarse. This is due to thenoise in the images and thresholdingbased segmentation methods. Wesmooth connect components using convolution with Weibull kernel whoseparameters are determined by the equation (3). FIG. 65 shows thesmoothing results of connected components.

FIG. 60A shows LSCM volume data composed of 58 slices with an in-plane512×512 pixels, and 100×100×57:8 mm field of view. After computation ofthe Weibull a-b histogram shown in FIG. 62, we move the top-left vertexto (24, 0) and the bottom-right to (255, 100). After the initialsegmentation using voxel labelling by the Weibull a-b histogram, thecorresponding result TO is given in FIG. 64A. Using GCCL for TO andsetting the threshold of the number of voxels in a connected component200, we can get the segmentation results shown in FIG. 66. FIG. 67 showsthe effect of the top-left vertex on Weibull a-b his-togram on theconnection for glia cells in LSCM image FIG. 60B. By adjusting theposition of vertex on Weibull a-b histogram, the connections betweenadjacent astrocytes can be more clearly defined, allowing for thepotential identification of in-dividual astrocyte. FIG. 60A is obtained,when the top-left vertex is moved to (31,0), and FIG. 60B to (24,0).

As another example, we consider a GFAP and implant LSCM volume datashown in FIG. 64B. FIG. 64B is 127 slices with an in-plane 512×512matrix. FIG. 68 shows the segmentation results of the GFAP and implantdata. We have tested our algorithm with 69 different LSCM volume data.As illustrated here, we have observed that the overall approach can bevery effective at segmenting LSCM images.

In this paper, we have presented a new statistical modeling of vol-umedata to segment a target of interest, and a GCCL algorithm to refine theinitial segmentation from the Weibull statistical model-ing. Thismethod, as illustrated by pilot application in LSCM im-ages analysis, iscapable of segmenting the structures within data and can be applied toreal problems such as those encountered in tissue segmentation. One ofthe remaining limitations of the present approach is that it is stillsemi-automatic and consequently requires the intervention and expertiseof the user. It would be desirable to move in the direction of a morefully automatic segmentation procedure.

Conclusion

The above-described system and method of the present invention possessesnumerous advantages as described herein and in the referencedappendices. Additional advantages and modifications will readily occurto those skilled in the art. Therefore, the invention in its broaderaspects is not limited to the specific details, representative devices,and illustrative examples shown and described. Accordingly, departuresmay be made from such details without departing from the spirit or scopeof the general inventive concept.

1. A computer system for storing, archiving, querying and retrievinginformation relating to three-dimensional objects; the systemcomprising: data acquisition means for acquiring point coordinate dataabout the three-dimensional object; a database component; a processoroperable to: generate modeled data from the point coordinate data;segment the modeled data into feature data representing a plurality offeatures of the object; store the modeled data and the feature data inthe database component; and retrieve modeled data and feature data fromthe database using search criteria comprising representing an objectfeature; and a user interface operative with the processor to: input tothe processor search criteria; and display data retrieved by theprocessor as a representation of an object feature.
 2. The system ofclaim 1 wherein the point coordinate data is surface data.
 3. The systemof claim 1 wherein the point coordinate data is volume data.
 4. Thesystem of claim 1 wherein the feature data represents a point.
 5. Thesystem of claim 1 wherein the feature data represents a curve.
 6. Thesystem of claim 1 wherein the feature data represents a facet on asurface.
 7. The system of claim 1 wherein the processor is furtheroperable to compress the modeled data.
 8. The system of claim 7 whereinthe modeled data comprises a triangle mesh and the processor is operableto compress the modeled data using B-spline curves.
 9. The system ofclaim 7 wherein the modeled data comprises a triangle mesh and theprocessor is operable to segment the modeled data using subdivisionsurface compression.
 10. The system of claim 1 wherein the pointcoordinate data comprises a triangle mesh and the processor is operableto segment the modeled data using watershed segmentation methodincluding improved curvature estimation.
 11. The system of claim 1wherein the point coordinate data comprises a triangle mesh and theprocessor is operable to segment the modeled data using a hybridsegmentation method.
 12. The system of claim 1 wherein the pointcoordinate data comprises a triangle mesh and the processor is operableto segment the modeled data using watershed segmentation.
 13. The systemof claim 1 wherein the point coordinate data comprises volume data theprocessor is operable to segment the modeled data using Weibull E-SDfields.
 14. The system of claim 1 wherein the point coordinate datacomprises volume data and the processor is operable to segment themodeled data using Greedy connected component labeling refinement. 15.The system of claim 1 wherein the user interface includes a graphic userinterface that operates with the processor to allow a sketch-basedsearch of the database.
 16. A method for storing, archiving, queryingand retrieving information relating to 3D objects; the systemcomprising: acquiring point coordinate data from a physical object;generating modeled data from the point coordinate data; segmenting themodeled data into feature data representing a plurality of features ofthe physical object; storing the modeled data and the feature data in adatabase; and organizing the modeled data and the feature data so thatfeatures of the physical object can be automatically extracted foronline query and retrieval of the plurality of features of the physicalobject.
 17. The method of claim 16 wherein the point coordinate data issurface data.
 18. The method of claim 16 wherein the point coordinatedata is volume data.
 19. The method of claim 16 further comprisingcompressing the modeled data.
 20. The method of claim 19 wherein themodeled data comprises a triangle mesh and compressing the modeled datais performed using B-spline curves.
 21. The method of claim 19 whereinthe modeled data comprises a triangle mesh and compressing the modeleddata is performed using a subdivision surface compression method. 22.The method of claim 16 wherein the point coordinate data comprises atriangle mesh and segmenting the modeled data is performed usingwatershed segmentation.
 23. The method of claim 16 wherein the pointcoordinate data comprises a triangle mesh and segmenting the modeleddata is performed using a hybrid segmentation method.
 24. The method ofclaim 16 wherein the point coordinate data comprises volume data andsegmenting the modeled data is performed using Weibull E-SD fields. 25.The method of claim 16 wherein the point coordinate data comprisesvolume data and segmenting the modeled data is performed using Greedyconnected component labeling refinement.